Cartesia

Supported service: tts
Key: cartesia
Integrated: Yes, but you may want to provide your own key to use your custom voices. See BYO Keys for more info.

Service options

voice

string

default:"79a125e8-cd45-4c13-8a67-188112f4dd22"

The voice the TTS service starts with. Select any voice from the available Cartesia voices.

{
  "service_options": {
    "cartesia": {
      "voice": "820a3788-2b37-4d21-847a-b65d8a68c99a"
    }
  }
}

sample_rate

integer

default:"24000"

The sample rate of the TTS audio output, in Hz.

The sample rate must be one of these values: 8000, 16000, 22050, 24000, 44100, 48000.

{
  "service_options": {
    "cartesia": {
      "sample_rate": 24000
    }
  }
}
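A minimal sketch of building the service options above in Python, with the sample rate checked against the allowed values before the payload is sent. The helper name `cartesia_service_options` is hypothetical and not part of any SDK; only the JSON shape and the list of rates come from this page.

```python
# Allowed rates copied from the list above.
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}


def cartesia_service_options(voice=None, sample_rate=None):
    """Build the "service_options" fragment (hypothetical helper)."""
    options = {}
    if voice is not None:
        options["voice"] = voice
    if sample_rate is not None:
        # Reject rates that are not in the documented list.
        if sample_rate not in ALLOWED_SAMPLE_RATES:
            raise ValueError(f"unsupported sample_rate: {sample_rate}")
        options["sample_rate"] = sample_rate
    return {"service_options": {"cartesia": options}}
```

Omitted arguments are left out of the payload, so the service defaults apply.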

Configuration options

model

string

default:"sonic-english"

Select from the available Cartesia models.

For the best English performance, we recommend the "sonic-english" model. For other languages, we recommend the "sonic-multilingual" model.

{
  "name": "model",
  "value": "sonic-english"
}

voice

string

default:"79a125e8-cd45-4c13-8a67-188112f4dd22"

The voice you want your TTS service to use. Select any voice from the available Cartesia voices.

Click the “Try it out” button on https://cartesia.ai/sonic to sign up for a free account and sample the built-in voices. Voice IDs are also listed on Cartesia's playground page.

{
  "name": "voice",
  "value": "820a3788-2b37-4d21-847a-b65d8a68c99a"
}

language

string

default:"en"

The language you want your TTS service to use. To select a non-English language, select the sonic-multilingual model and specify the language. Learn more.

{
  "name": "model",
  "value": "sonic-multilingual"
},
{
  "name": "language",
  "value": "es"
}
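Because the model and language have to agree, it can help to generate both entries together. The sketch below assumes a hypothetical helper, `language_options`, that picks the model from the language code; the model names and the name/value shape are the ones shown on this page.

```python
def language_options(language="en"):
    """Return matching "model" and "language" entries (hypothetical helper)."""
    # English uses the English model; anything else needs the multilingual one.
    model = "sonic-english" if language == "en" else "sonic-multilingual"
    return [
        {"name": "model", "value": model},
        {"name": "language", "value": language},
    ]
```

The helper does not check whether Cartesia actually supports the given language code.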

speed

string

The rate at which the text is spoken. Learn more.

Speed options include slowest, slow, normal, fast, and fastest.

For more granular control, you can define speed as a number within the range [-1.0, 1.0]. A value of 0 represents the default speed, while negative values slow down the speech and positive values speed it up.

{
  "name": "speed",
  "value": "fast"
}
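The two accepted forms of speed can be checked before sending. This is a sketch with a hypothetical helper, `speed_option`; it assumes a numeric speed is passed through as a plain number, which this page does not state explicitly.

```python
NAMED_SPEEDS = ("slowest", "slow", "normal", "fast", "fastest")


def speed_option(speed):
    """Validate a speed and wrap it as a config entry (hypothetical helper)."""
    if isinstance(speed, str):
        if speed not in NAMED_SPEEDS:
            raise ValueError(f"unknown speed name: {speed!r}")
    elif isinstance(speed, bool) or not isinstance(speed, (int, float)):
        # bool is a subclass of int in Python, so exclude it explicitly.
        raise TypeError("speed must be a name or a number")
    elif not -1.0 <= speed <= 1.0:
        raise ValueError("numeric speed must be within [-1.0, 1.0]")
    # Assumption: numeric values are sent as JSON numbers, not strings.
    return {"name": "speed", "value": speed}
```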

emotion

string[]

The emotion parameter is an array of “tags” in the form emotion_name:level. For example, positivity:high or curiosity. Learn more.

Emotion names: anger, positivity, surprise, sadness, curiosity.

Emotion levels: lowest, low, high, highest. Omit the level for a moderate amount of the emotion.

{
  "name": "emotion",
  "value": ["positivity:high", "curiosity"]
}
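Emotion tags are plain strings, so they are easy to mistype. The following sketch builds them from the names and levels listed above; `emotion_tag` and `emotion_option` are hypothetical helpers, not part of any SDK.

```python
EMOTION_NAMES = {"anger", "positivity", "surprise", "sadness", "curiosity"}
EMOTION_LEVELS = {"lowest", "low", "high", "highest"}


def emotion_tag(name, level=None):
    """Return "name" or "name:level" after validating both parts."""
    if name not in EMOTION_NAMES:
        raise ValueError(f"unknown emotion: {name!r}")
    if level is None:
        # No level means a moderate amount of the emotion.
        return name
    if level not in EMOTION_LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return f"{name}:{level}"


def emotion_option(*tags):
    """Wrap one or more tags as an "emotion" config entry."""
    return {"name": "emotion", "value": list(tags)}
```

For example, `emotion_option(emotion_tag("positivity", "high"), emotion_tag("curiosity"))` produces the same entry as the JSON above.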

text_filter

object

Control whether the TTS service filters out markdown, code blocks, or tables from its output.

Basic markdown filtering is enabled by default. Enable code and table filtering as needed. Filtering code and tables can help the TTS avoid mistakes.

{
  "name": "text_filter",
  "value": {
    "enable_text_filter": true,
    "filter_code": true,
    "filter_tables": true
  }
}
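As a sketch, the same entry can be generated from keyword flags and serialized with the standard `json` module. `text_filter_option` is a hypothetical helper, and its defaults assume that code and table filtering are off unless enabled, which this page implies but does not state outright.

```python
import json


def text_filter_option(filter_code=False, filter_tables=False,
                       enable_text_filter=True):
    """Build a "text_filter" config entry (hypothetical helper)."""
    return {
        "name": "text_filter",
        "value": {
            "enable_text_filter": enable_text_filter,
            # Assumed defaults: code and table filtering are opt-in.
            "filter_code": filter_code,
            "filter_tables": filter_tables,
        },
    }


# Serialize the entry for inclusion in a request body.
payload = json.dumps(text_filter_option(filter_code=True, filter_tables=True))
```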
