Configuration
Defining the services and their configurations for Daily Bots
The RTVIClient takes a list of services along with the JSON configuration used to set up those services. Because it is an open source component, the services and their configurations are highly flexible. However, Daily Bots has a predefined set of services that can be used and keys to configure them.
The full set of configuration options is detailed in the Pipecat client SDK API Reference. Also, different services used in your Daily Bots have different capabilities. Check out the supported services to learn about available services. Then, view each service page for a full list of available configuration options.
Services
Each bot profile expects a specific set of services to be configured. The most common configuration is to have three services, speech-to-text (STT), LLM, and text-to-speech (TTS), which can be configured as follows:
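A minimal TypeScript sketch of the three-service setup described above. The ServiceMap type is a local assumption, not a type imported from the SDK, and the provider names are illustrative placeholders; check the supported services page for the accepted values.

```typescript
// Hypothetical local type: maps a service role to a provider name.
type ServiceMap = Record<string, string>;

// Illustrative provider names only; confirm the accepted values
// on the supported services page.
const services: ServiceMap = {
  stt: "deepgram",
  llm: "anthropic",
  tts: "cartesia",
};
```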
For the list of accepted service name values, see the supported services page. Also, see the bot profiles for a list of available profiles.
Configuration
The configuration object is where you set various options for your services. It should be passed as part of your RTVIClient constructor, but it can also be updated at any time using updateConfig().
The config is a list because order matters; configurations are applied sequentially. For example, to configure a TTS service with a new voice and an LLM service with new prompting, specify the TTS first to ensure the voice is applied before the prompting. This ordering principle also applies to service options, ensuring deterministic outcomes for all RTVI implementations.
For Daily Bots, you must provide configuration options for both your llm and tts services. Optionally, you can also specify an stt service, which provides control over the model and language parameters. The general format for the configuration object is as follows:
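A sketch of what such a configuration list might look like in TypeScript, with the TTS entry placed before the LLM entry to follow the ordering guidance above. The ConfigOption and ServiceConfig types and the "service" and "options" key names are assumptions made for illustration; the angle-bracket values are placeholders, not real identifiers.

```typescript
// Local types sketched from the description above; not imported from the SDK.
interface ConfigOption {
  name: string;
  value: unknown;
}

interface ServiceConfig {
  service: string; // assumed key: "stt", "tts", or "llm"
  options: ConfigOption[]; // assumed key: ordered list of name/value pairs
}

// TTS is listed first so the voice is applied before the prompting.
const config: ServiceConfig[] = [
  {
    service: "tts",
    options: [{ name: "voice", value: "<voice-id>" }],
  },
  {
    service: "llm",
    options: [
      { name: "model", value: "<model-name>" },
      {
        name: "initial_messages",
        value: [{ role: "system", content: "You are a helpful assistant." }],
      },
    ],
  },
];
```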
Speech-to-text services
Speech-to-text (STT) services are responsible for transcribing the user's speech into text, which is then typically passed to the LLM service.
Find available STT services on the supported services page.
STT configuration
The main considerations for STT services are the model and language parameters. The default language is English, but you can change this by setting the language value. The model and language parameters commonly depend on each other, so be sure to select model and language values that are supported by the service you are using.
Each STT service has its own set of configuration options. Visit the STT service’s page for a full list of available configuration options.
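As a rough illustration, an STT entry that sets both parameters could be written as follows. The types and the "service" and "options" keys are local assumptions, and the model and language values are placeholders; valid combinations depend on the STT service you choose.

```typescript
// Local types for illustration; not imported from the SDK.
interface ConfigOption {
  name: string;
  value: unknown;
}

interface ServiceConfig {
  service: string;
  options: ConfigOption[];
}

// Placeholder values: pick a model/language pair your STT service supports.
const sttConfig: ServiceConfig = {
  service: "stt",
  options: [
    { name: "model", value: "<stt-model>" },
    { name: "language", value: "<language-code>" },
  ],
};
```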
Text-to-speech services
Text-to-speech (TTS) services are responsible for converting the bot’s text responses into speech.
Find available TTS services on the supported services page.
TTS configuration
The main consideration for TTS services is the voice option. Depending on the service you are using, you may also have an option for model. The values for voice and model are defined by the service you are using.
For additional options such as speed, emotion, and stability, visit the TTS service’s page for a full list of available configuration options.
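A comparable sketch for a TTS entry, under the same assumptions: the types and key names are local to this example, and the voice and model values are placeholders that must be replaced with identifiers defined by your TTS service.

```typescript
// Local types for illustration; not imported from the SDK.
interface ConfigOption {
  name: string;
  value: unknown;
}

interface ServiceConfig {
  service: string;
  options: ConfigOption[];
}

// Placeholder values: voice and model identifiers come from the TTS provider.
const ttsConfig: ServiceConfig = {
  service: "tts",
  options: [
    { name: "voice", value: "<voice-id>" },
    { name: "model", value: "<tts-model>" }, // only if the service offers a model option
  ],
};
```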
LLM services
LLM services are responsible for generating text responses based on the user’s input. The options and format of those options are generally defined by the service and model you are using. Each option expects a "name" and "value" field, allowing the bot to dynamically apply the option to any llm and model.
Find available LLM services on the supported services page.
Core functionality
For Daily Bots, a few options are either required or commonly used. They are detailed below:
The model option sets the model you want to use for the LLM service. This is a required field. When using an integrated service without an API key, the model string must match one of the options outlined on the supported services page.
Example:
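One way this option might look as a name/value pair. The model string shown is illustrative only; confirm the accepted strings on the supported services page. The ConfigOption type is a local assumption.

```typescript
// Local type for illustration; not imported from the SDK.
interface ConfigOption {
  name: string;
  value: unknown;
}

// Illustrative model string; it must match an entry on the
// supported services page when using an integrated service.
const modelOption: ConfigOption = {
  name: "model",
  value: "claude-3-5-sonnet-20240620",
};
```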
The initial_messages option is the initial set of messages to prompt the LLM service with. This is a required field when providing the first configuration. For subsequent configuration updates, use the “messages” option. The format of the messages is defined by the service and model you are using, but each message generally sets a role and content.
Examples:
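A minimal sketch using the common role/content chat shape (the shape OpenAI's chat format uses). Other services may expect a different message format, so treat the ChatMessage type as an assumption and check the references for your provider.

```typescript
// Assumed message shape; the real format is defined by the LLM service.
interface ChatMessage {
  role: string;
  content: string;
}

interface ConfigOption {
  name: string;
  value: unknown;
}

const initialMessages: ChatMessage[] = [
  {
    role: "system",
    content: "You are a friendly assistant. Keep responses brief.",
  },
];

const initialMessagesOption: ConfigOption = {
  name: "initial_messages",
  value: initialMessages,
};
```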
Anthropic Messages API
Full reference documentation for Anthropic’s messages API.
Together AI Messages API
Full reference documentation for Together’s message API.
Groq Messages API
Full reference documentation for Groq’s message API.
OpenAI Messages API
Full reference documentation for OpenAI’s message API.
The messages option is the same as initial_messages, but it is used for subsequent configuration updates. This is a required field when providing a configuration update. For the first configuration, use the “initial_messages” option.
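A sketch of an update payload that swaps in new prompting through the messages option. The types and key names are local assumptions; the resulting list is the kind of value you would hand to updateConfig(), whose exact signature is not shown here.

```typescript
// Local types for illustration; not imported from the SDK.
interface ChatMessage {
  role: string;
  content: string;
}

interface ConfigOption {
  name: string;
  value: unknown;
}

interface ServiceConfig {
  service: string;
  options: ConfigOption[];
}

const newMessages: ChatMessage[] = [
  { role: "system", content: "You are now a concise travel guide." },
];

// Update payload: uses "messages", not "initial_messages".
const llmUpdate: ServiceConfig[] = [
  {
    service: "llm",
    options: [{ name: "messages", value: newMessages }],
  },
];
```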
run_on_config is a boolean field that forces the bot to talk first. Without this setting, the bot will not begin speaking until the user does. This is an optional field and defaults to false.
IMPORTANT: This field typically should be listed last in your configuration to ensure the bot does not start speaking before it receives its initial messages. Otherwise, fun bot hallucinations may occur.
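The ordering advice above can be checked mechanically. The helper below, runOnConfigIsLast, is a hypothetical utility written for this example (it is not part of the SDK); it returns true when run_on_config is either absent or the final option in the list.

```typescript
// Local type for illustration; not imported from the SDK.
interface ConfigOption {
  name: string;
  value: unknown;
}

// Hypothetical helper: true if run_on_config is absent or listed last.
function runOnConfigIsLast(options: ConfigOption[]): boolean {
  for (let i = 0; i < options.length; i++) {
    if (options[i].name === "run_on_config") {
      return i === options.length - 1;
    }
  }
  return true; // option absent, so there is nothing to check
}

const wellOrdered: ConfigOption[] = [
  { name: "model", value: "<model-name>" },
  { name: "initial_messages", value: [{ role: "system", content: "Greet the user." }] },
  { name: "run_on_config", value: true },
];

const badlyOrdered: ConfigOption[] = [
  { name: "run_on_config", value: true },
  { name: "initial_messages", value: [{ role: "system", content: "Greet the user." }] },
];
```

Here runOnConfigIsLast(wellOrdered) is true and runOnConfigIsLast(badlyOrdered) is false, since the second list would let the bot start speaking before it has its initial messages.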
Currently only works with Anthropic Claude 3.5 Sonnet and Claude 3 Haiku.
Setting this field to true will enable Anthropic’s prompt caching feature. This feature allows the bot to remember the last prompt it received and use it as a starting point for the next prompt. This is an optional field and defaults to false.
Currently only works with Anthropic and OpenAI.
This field describes to the LLM all the tools it has access to and how to call them. This feature is also referred to as “function calling”. The format for describing each tool is highly dependent on the service, but it typically requires you to give your tool a name, a description, and an object detailing the set of parameters your tool expects to receive. For more information on setting up tool calling, see our full tutorial. This is an optional field and defaults to an empty array.
Examples:
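A sketch of a single tool definition in the shape Anthropic's Messages API uses for tools (name, description, and an input_schema JSON Schema object). The get_weather tool is a made-up example, and the option name "tools" is an assumption; OpenAI expects a different wrapper around the same kind of information.

```typescript
// Tool shape following Anthropic's tool-use format.
interface AnthropicTool {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

interface ConfigOption {
  name: string;
  value: unknown;
}

// Hypothetical tool used only for illustration.
const weatherTool: AnthropicTool = {
  name: "get_weather",
  description: "Get the current weather for a given location.",
  input_schema: {
    type: "object",
    properties: {
      location: {
        type: "string",
        description: "City and state, e.g. San Francisco, CA",
      },
    },
    required: ["location"],
  },
};

// Assumed option name; the value is an array of tool definitions.
const toolsOption: ConfigOption = {
  name: "tools",
  value: [weatherTool],
};
```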
Configuration options
Each LLM service has its own set of configuration options for parameters like temperature, max_tokens, top_p, and more. Visit the LLM service’s page for a full list of available configuration options.