Gemini Multimodal Live
- Supported service: `llm`
- Key: `gemini_live`
- Integrated: No. See BYO Keys for more info.
Service options
`model`
The model that will complete your prompt.
`voice`
The voice you want your speech-to-speech service to use. Supported voices are `Aoede`, `Charon`, `Fenrir`, `Kore`, and `Puck`. See the docs for the latest information.
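As a sketch, the two service options above could be set in a payload like the following. The key names mirror the options described here, but the surrounding structure (and the example model name) are assumptions, not the official schema:

```python
# Hypothetical service-options payload for the gemini_live service.
# "model" and "voice" correspond to the options documented above;
# the wrapper structure and model name are illustrative assumptions.
service_options = {
    "service": "gemini_live",
    "options": {
        "model": "models/gemini-2.0-flash-exp",  # assumed model name
        "voice": "Puck",  # one of the supported voices listed above
    },
}

SUPPORTED_VOICES = {"Aoede", "Charon", "Fenrir", "Kore", "Puck"}
assert service_options["options"]["voice"] in SUPPORTED_VOICES
```

Check the service's reference docs for the exact payload shape your integration expects.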
Configuration options
`frequency_penalty`
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
`model`
The model that will complete your prompt.
`presence_penalty`
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
`temperature`
Amount of randomness injected into the response. Ranges from 0.0 to 2.0. Use a temperature closer to the low end of the range for analytical or multiple-choice tasks, and closer to the high end for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
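The mechanism behind temperature can be illustrated with a small sketch: logits are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. This is a generic illustration of temperature scaling, not Gemini's internal implementation:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then normalize.

    Lower temperatures sharpen the distribution (more deterministic);
    higher temperatures flatten it (more random). A small floor avoids
    division by zero at temperature 0.0.
    """
    t = max(temperature, 1e-6)
    scaled = [logit / t for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

low = softmax_with_temperature([2.0, 1.0, 0.5], 0.2)
high = softmax_with_temperature([2.0, 1.0, 0.5], 2.0)
# The most likely token dominates at low temperature and loses
# probability mass as temperature rises.
assert low[0] > high[0]
```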
`top_k`
Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only; you usually only need to use temperature.
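Conceptually, top-K filtering keeps only the K most likely options and renormalizes. A minimal sketch of the idea (a generic illustration, not Gemini's sampler):

```python
def top_k_filter(probs, k):
    """Keep only the k highest-probability options and renormalize."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}

# With k=2, the two lowest-probability options are cut, and the
# survivors are rescaled to sum to 1.
filtered = top_k_filter([0.5, 0.3, 0.15, 0.05], k=2)
```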
`top_p`
Use nucleus sampling. In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches the probability specified by top_p. You should alter either temperature or top_p, but not both. See the Gemini Multimodal Live docs for more information. Recommended for advanced use cases only; you usually only need to use temperature.
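The cumulative cutoff described above can be sketched in a few lines. Again, this is a generic illustration of nucleus sampling, not Gemini's actual sampler:

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of options whose cumulative probability
    reaches top_p, scanning in decreasing probability order, then
    renormalize the survivors."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the "nucleus" now covers top_p of the mass
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# The 0.5 and 0.3 options together exceed top_p=0.75, so the long
# tail (0.15 and 0.05) is cut before sampling.
nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.75)
```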
`extra`
A dictionary that can contain any additional parameters supported by Gemini that you want to pass to the API. Refer to the Gemini Multimodal Live reference docs for more information on each of these configuration options.
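As a sketch, the escape-hatch dictionary might be used like this. The surrounding structure and the parameter name inside it are assumptions; check the Gemini reference docs for the parameters it actually accepts:

```python
# Hypothetical configuration combining a documented option with the
# "extra" pass-through dictionary. Anything inside "extra" would be
# forwarded to Gemini as-is; max_output_tokens is an assumed example.
config = {
    "temperature": 0.7,
    "extra": {
        "max_output_tokens": 1024,  # assumed Gemini parameter
    },
}
```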