• Supported service: tts
  • Key: cartesia
  • Integrated: Yes, but you may want to provide your own key to use your custom voices. See BYO Keys for more info.

Service options

voice
string
default: "79a125e8-cd45-4c13-8a67-188112f4dd22"

The voice the TTS service is initialized with. Select any voice from the available Cartesia voices.

{
  "service_options": {
    "cartesia": {
      "voice": "820a3788-2b37-4d21-847a-b65d8a68c99a"
    }
  }
}
sample_rate
integer
default: 24000

The sample rate of the TTS audio output, in Hz.

The sample rate must be one of these values: 8000, 16000, 22050, 24000, 44100, 48000.

{
  "service_options": {
    "cartesia": {
      "sample_rate": 24000
    }
  }
}
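
Since only a fixed set of sample rates is accepted, it can help to check the value before sending the configuration. The following is a minimal sketch, assuming a hypothetical helper named `cartesia_service_options` that is not part of any SDK. It only builds the JSON fragment shown above and rejects rates outside the listed set.

```python
from typing import Optional

# Sample rates listed in this section as valid values.
ALLOWED_SAMPLE_RATES = (8000, 16000, 22050, 24000, 44100, 48000)


def cartesia_service_options(
    voice: Optional[str] = None,
    sample_rate: Optional[int] = None,
) -> dict:
    """Build the "service_options" fragment for Cartesia.

    Hypothetical helper for illustration only. Options left as None are
    omitted so that the service defaults apply.
    """
    options: dict = {}
    if voice is not None:
        options["voice"] = voice
    if sample_rate is not None:
        if sample_rate not in ALLOWED_SAMPLE_RATES:
            raise ValueError(
                f"sample_rate must be one of {ALLOWED_SAMPLE_RATES}, got {sample_rate!r}"
            )
        options["sample_rate"] = sample_rate
    return {"service_options": {"cartesia": options}}
```

Calling `cartesia_service_options(sample_rate=24000)` produces the same structure as the JSON example above, while an unsupported value such as 12345 raises a `ValueError` before any request is made.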

Configuration options

model
string
default: "sonic-english"

Select from the available Cartesia models.

For the best performance in English, we recommend the "sonic-english" model. For multilingual use, we recommend the "sonic-multilingual" model.

{
  "name": "model",
  "value": "sonic-english"
}
voice
string
default: "79a125e8-cd45-4c13-8a67-188112f4dd22"

The voice you want your TTS service to use. Select any voice from the available Cartesia voices.

Click the “Try it out” button on https://cartesia.ai/sonic to sign up for a free account and sample the built-in voices. Voice IDs are also listed on the Cartesia playground page.

{
  "name": "voice",
  "value": "820a3788-2b37-4d21-847a-b65d8a68c99a"
}
language
string
default: "en"

The language you want your TTS service to use. To use a non-English language, select the sonic-multilingual model and specify the language. Learn more.

{
  "name": "model",
  "value": "sonic-multilingual"
},
{
  "name": "language",
  "value": "es"
}
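
Because the language and the model have to agree, the two entries can be generated together. This is a rough sketch, assuming a hypothetical helper named `cartesia_language_config`. The entry shape mirrors the JSON snippets in this section, and the model choice follows the recommendation given for the model option.

```python
def cartesia_language_config(language: str = "en") -> list:
    """Return the "model" and "language" config entries for a language.

    Hypothetical helper for illustration only. English uses
    "sonic-english"; any other language code uses "sonic-multilingual".
    The language code itself is passed through without validation.
    """
    model = "sonic-english" if language == "en" else "sonic-multilingual"
    return [
        {"name": "model", "value": model},
        {"name": "language", "value": language},
    ]
```

For example, `cartesia_language_config("es")` yields the same two entries as the Spanish example above.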
speed
string

The rate at which the text is spoken. Learn more.

Speed options include slowest, slow, normal, fast, and fastest.

For more granular control, you can define speed as a number within the range [-1.0, 1.0]. A value of 0 represents the default speed, while negative values slow down the speech and positive values speed it up.

{
  "name": "speed",
  "value": "fast"
}
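
The two accepted forms of speed, a named option or a number in [-1.0, 1.0], can be checked with a small validator. The sketch below assumes a hypothetical helper named `speed_config`. It also assumes the numeric form is sent as a JSON number, which this section does not spell out.

```python
from typing import Union

# Named speed options listed in this section.
SPEED_NAMES = ("slowest", "slow", "normal", "fast", "fastest")


def speed_config(speed: Union[str, int, float]) -> dict:
    """Build the "speed" config entry from a named option or a number.

    Hypothetical helper for illustration only. Strings must be one of
    SPEED_NAMES; numbers must lie within [-1.0, 1.0], where 0 is the
    default speed.
    """
    if isinstance(speed, str):
        if speed not in SPEED_NAMES:
            raise ValueError(f"speed must be one of {SPEED_NAMES}, got {speed!r}")
    elif isinstance(speed, bool) or not isinstance(speed, (int, float)):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("speed must be a string or a number")
    elif not -1.0 <= speed <= 1.0:
        raise ValueError(f"numeric speed must be within [-1.0, 1.0], got {speed!r}")
    return {"name": "speed", "value": speed}
```

With this sketch, `speed_config("fast")` matches the JSON example above, `speed_config(-0.5)` requests slower speech, and `speed_config(1.5)` raises a `ValueError`.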
emotion
string[]

The emotion parameter is an array of “tags”. Each tag has the form emotion_name:level, where the level is optional. For example, positivity:high or curiosity. Learn more.

Emotion names: anger, positivity, surprise, sadness, curiosity.

Emotion levels: lowest, low, high, highest. Omit the level for a moderate addition of emotion.

{
  "name": "emotion",
  "value": ["positivity:high", "curiosity"]
}
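
Emotion tags are plain strings, so a typo in a name or level is easy to miss. The following sketch assumes two hypothetical helpers, `emotion_tag` and `emotion_config`, that assemble tags only from the names and levels listed in this section.

```python
from typing import Optional

# Emotion names and levels listed in this section.
EMOTION_NAMES = ("anger", "positivity", "surprise", "sadness", "curiosity")
EMOTION_LEVELS = ("lowest", "low", "high", "highest")


def emotion_tag(name: str, level: Optional[str] = None) -> str:
    """Format one tag as "name:level", or just "name" for a moderate level.

    Hypothetical helper for illustration only.
    """
    if name not in EMOTION_NAMES:
        raise ValueError(f"emotion name must be one of {EMOTION_NAMES}, got {name!r}")
    if level is None:
        return name
    if level not in EMOTION_LEVELS:
        raise ValueError(f"emotion level must be one of {EMOTION_LEVELS}, got {level!r}")
    return f"{name}:{level}"


def emotion_config(*tags: str) -> dict:
    """Wrap already formatted tags in the "emotion" config entry."""
    return {"name": "emotion", "value": list(tags)}
```

For example, `emotion_config(emotion_tag("positivity", "high"), emotion_tag("curiosity"))` produces the same entry as the JSON example above.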
text_filter
object

Control whether the TTS service filters out markdown, code blocks, or tables from its output.

Basic markdown filtering is enabled by default. Enable code and table filtering as needed. Filtering code and tables can help the TTS avoid mistakes.

{
  "name": "text_filter",
  "value": {
    "enable_text_filter": true,
    "filter_code": true,
    "filter_tables": true
  }
}
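
The filter entry can also be assembled programmatically. This is a minimal sketch, assuming a hypothetical helper named `text_filter_config`. Its defaults (basic filtering on, code and table filtering off) are an interpretation of the description above, not confirmed service defaults.

```python
def text_filter_config(
    filter_code: bool = False,
    filter_tables: bool = False,
    enable_text_filter: bool = True,
) -> dict:
    """Build the "text_filter" config entry.

    Hypothetical helper for illustration only. The default argument
    values are assumptions based on the prose in this section.
    """
    return {
        "name": "text_filter",
        "value": {
            "enable_text_filter": enable_text_filter,
            "filter_code": filter_code,
            "filter_tables": filter_tables,
        },
    }
```

Calling `text_filter_config(filter_code=True, filter_tables=True)` yields the same entry as the JSON example above.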