Gemini Multimodal Live
- Supported service: `llm`
- Key: `gemini_live`
- Integrated: No. See BYO Keys for more info.
Service options
`model`
The model that will complete your prompt.
`voice`
The voice you want your speech-to-speech service to use. Supported voices are `Aoede`, `Charon`, `Fenrir`, `Kore`, and `Puck`. See the docs for the latest information.
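As a sketch, the two service options above could be set in a payload like the following. The key names mirror the options described here, but the surrounding structure (and the example model name) are assumptions, not the official schema:

```python
# Hypothetical service-options payload for the gemini_live service.
# "model" and "voice" correspond to the options documented above;
# the wrapper structure and model name are illustrative assumptions.
service_options = {
    "service": "gemini_live",
    "options": {
        "model": "models/gemini-2.0-flash-exp",  # assumed model name
        "voice": "Puck",  # one of the supported voices listed above
    },
}

SUPPORTED_VOICES = {"Aoede", "Charon", "Fenrir", "Kore", "Puck"}
assert service_options["options"]["voice"] in SUPPORTED_VOICES
```

Check the service's reference docs for the exact payload shape your integration expects.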
Configuration options
`frequency_penalty`
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
`model`
The model that will complete your prompt.
`presence_penalty`
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
`temperature`
Amount of randomness injected into the response. Ranges from 0.0 to 2.0. Use a temperature closer to the low end of the range for analytical or multiple-choice tasks, and closer to the high end for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
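The mechanism behind temperature can be illustrated with a small sketch: logits are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. This is a generic illustration of temperature scaling, not Gemini's internal implementation:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then normalize.

    Lower temperatures sharpen the distribution (more deterministic);
    higher temperatures flatten it (more random). A small floor avoids
    division by zero at temperature 0.0.
    """
    t = max(temperature, 1e-6)
    scaled = [logit / t for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

low = softmax_with_temperature([2.0, 1.0, 0.5], 0.2)
high = softmax_with_temperature([2.0, 1.0, 0.5], 2.0)
# The most likely token dominates at low temperature and loses
# probability mass as temperature rises.
assert low[0] > high[0]
```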
`top_k`
Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only; you usually only need to use temperature.
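Conceptually, top-K filtering keeps only the K most likely options and renormalizes. A minimal sketch of the idea (a generic illustration, not Gemini's sampler):

```python
def top_k_filter(probs, k):
    """Keep only the k highest-probability options and renormalize."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}

# With k=2, the two lowest-probability options are cut, and the
# survivors are rescaled to sum to 1.
filtered = top_k_filter([0.5, 0.3, 0.15, 0.05], k=2)
```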
`top_p`
Use nucleus sampling. In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches the probability specified by top_p. You should alter either temperature or top_p, but not both. See the Gemini Multimodal Live docs for more information. Recommended for advanced use cases only; you usually only need to use temperature.
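The cumulative cutoff described above can be sketched in a few lines. Again, this is a generic illustration of nucleus sampling, not Gemini's actual sampler:

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of options whose cumulative probability
    reaches top_p, scanning in decreasing probability order, then
    renormalize the survivors."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the "nucleus" now covers top_p of the mass
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# The 0.5 and 0.3 options together exceed top_p=0.75, so the long
# tail (0.15 and 0.05) is cut before sampling.
nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.75)
```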
`extra`
A dictionary that can contain any additional parameters supported by Gemini that you want to pass to the API. Refer to the Gemini Multimodal Live reference docs for more information on each of these configuration options.
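As a sketch, the escape-hatch dictionary might be used like this. The surrounding structure and the parameter name inside it are assumptions; check the Gemini reference docs for the parameters it actually accepts:

```python
# Hypothetical configuration combining a documented option with the
# "extra" pass-through dictionary. Anything inside "extra" would be
# forwarded to Gemini as-is; max_output_tokens is an assumed example.
config = {
    "temperature": 0.7,
    "extra": {
        "max_output_tokens": 1024,  # assumed Gemini parameter
    },
}
```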