The RTVIClient takes a list of services along with the JSON configuration used to set up those services. As an open source component, the services and their configurations are highly flexible. However, Daily Bots has a pre-defined set of services that can be used and keys to configure them.

The full set of configuration options is detailed in the RTVI API Reference. Also, different services used in your Daily Bots have different capabilities. Check out the supported services page to learn which services are available. Then, view each service's page for a full list of its configuration options.

Services

Each bot profile expects a specific set of services to be configured. The most common configuration uses three services: speech-to-text (STT), LLM, and text-to-speech (TTS), which can be configured as follows:

{
  "services": {
    "stt": "<stt service name>",
    "tts": "<tts service name>",
    "llm": "<llm service name>"
  }
}

For the list of accepted service name values, see the supported services page. Also, see the bot profiles for a list of available profiles.
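For example, a bot that transcribes with Deepgram, generates responses with Anthropic, and speaks with Cartesia might be declared as follows. The service names here are illustrative; confirm the exact accepted values on the supported services page:

{
  "services": {
    "stt": "deepgram",
    "llm": "anthropic",
    "tts": "cartesia"
  }
}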

Configuration

The configuration object is where you set various options for your services. This should be sent as part of your RTVIClient constructor but can also be updated at any time using updateConfig().

The config is a list because order matters; configurations are applied sequentially. For example, to configure a TTS service with a new voice and an LLM service with new prompting, specify the TTS first to ensure the voice is applied before the prompting. This ordering principle also applies to service options, ensuring deterministic outcomes for all RTVI implementations.

For Daily Bots, you must provide configuration options for both your llm and tts services. Optionally, you can also specify an stt service, which provides control over the model and language parameters. The general format for the configuration object is as follows:

{
  "config": [
    {
      "service": <service>, // "tts" || "llm" || "stt"
      "options": [
        {
          "name": "<option name>",
          "value": "<option value>"
        }
      ]
    },
    {
      "service": <service>, // "tts" || "llm" || "stt"
      "options": [
        {
          "name": "<option name>",
          "value": "<option value>"
        }
      ]
    },
    {
      "service": <service>, // "tts" || "llm || "stt"
      "options": [
        {
          "name": "<option name>",
          "value": "<option value>"
        }
      ]
    }
  ]
}
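Putting this together, a filled-in config for a TTS and LLM service might look like the following sketch. The option values are illustrative (the voice ID is a placeholder, and message formats vary by LLM provider); see each service's page for its exact options. Note that the tts entry comes first, per the ordering principle above:

{
  "config": [
    {
      "service": "tts",
      "options": [
        { "name": "voice", "value": "<voice id from your TTS provider>" }
      ]
    },
    {
      "service": "llm",
      "options": [
        { "name": "model", "value": "claude-3-5-sonnet-20240620" },
        {
          "name": "initial_messages",
          "value": [
            {
              "role": "system",
              "content": "You are a friendly voice assistant. Keep responses short."
            }
          ]
        },
        { "name": "run_on_config", "value": true } // see run_on_config below
      ]
    }
  ]
}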

Speech-to-text services

Speech-to-text (STT) services are responsible for transcribing the user's speech into text, which is then typically passed to the LLM service.

Find available STT services on the supported services page.

STT configuration

The main considerations for STT services are the model and language parameters. The default language is English, but you can change it by setting the language value. The model and language parameters often depend on each other (a given model may support only certain languages), so be sure to select model and language values that are supported by the service you are using.

Each STT service has its own set of configuration options. Visit the STT service’s page for a full list of available configuration options.
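For example, an stt config entry that sets both parameters might look like the following. The model value shown assumes Deepgram; other services accept different model names:

{
  "service": "stt",
  "options": [
    { "name": "model", "value": "nova-2-general" },
    { "name": "language", "value": "en" }
  ]
}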

Text-to-speech services

Text-to-speech (TTS) services are responsible for converting the bot’s text responses into speech.

Find available TTS services on the supported services page.

TTS configuration

The main consideration for TTS services is the voice option. Depending on the service you are using, you may also have an option for model. The values for voice and model are defined by the service you are using.

For additional configuration options like speed, emotion, stability, and more, visit the TTS service’s page for a full list of available configuration options.
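For example, a tts config entry that selects a voice looks like the following. The voice value is a placeholder; use a voice ID defined by your TTS provider:

{
  "service": "tts",
  "options": [
    { "name": "voice", "value": "<voice id from your TTS provider>" }
  ]
}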

LLM services

LLM services are responsible for generating text responses based on the user's input. The options and format of those options are generally defined by the service and model you are using. Each option expects a "name" and "value" field, allowing the bot to dynamically apply the option to any LLM and model.

Find available LLM services on the supported services page.

Core functionality

For Daily Bots, there are a few options that are required or commonly used, detailed below:

model
string
required

The model you want to use for the LLM service. This is a required field. When using an integrated service without an API key, the model string must match one of the options outlined in the supported services page.

Example:

{
  "name": "model",
  "value": "claude-3-5-sonnet-20240620"
}
initial_messages
Array[LLM messages]
required

The initial set of messages to prompt the LLM service with. This is a required field when providing the first configuration. For subsequent configuration updates, use the “messages” option. The format of the messages is defined by the service and model you are using, but generally involves setting a role and content for each message.

Examples:
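Assuming an OpenAI-style role/content message format (adjust to the format your provider expects):

{
  "name": "initial_messages",
  "value": [
    {
      "role": "system",
      "content": "You are a helpful voice assistant. Keep your answers brief."
    }
  ]
}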

messages
Array[LLM messages]
required

Same as initial_messages, but used for subsequent configuration updates. This is a required field when providing a configuration update. For the first configuration, use the “initial_messages” option.
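For example, a subsequent updateConfig() call could send the following, assuming the same role/content format as above:

{
  "name": "messages",
  "value": [
    {
      "role": "system",
      "content": "You are now a pirate. Answer everything in character."
    }
  ]
}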

run_on_config
bool
default: "false"

run_on_config is a boolean field that forces the bot to talk first. Without this setting, the bot will not begin speaking until the user does. This is an optional field and defaults to false.

IMPORTANT: This field typically should be listed last in your configuration to ensure the bot does not start speaking before it receives its initial messages. Otherwise, fun bot hallucinations may occur.
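Set it using the standard name/value option format, typically as the last option in your llm entry:

{
  "name": "run_on_config",
  "value": true
}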

enable_prompt_caching
bool

Currently only works with Anthropic Claude 3.5 Sonnet and Claude 3 Haiku

Setting this field to true will enable Anthropic’s prompt caching feature, which caches your prompt between turns so it can be reused rather than re-processed, reducing latency and cost. This is an optional field and defaults to false.
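For example, as an option in your llm entry:

{
  "name": "enable_prompt_caching",
  "value": true
}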

tools
Array[Tool Definition]

Currently only works with Anthropic and OpenAI

This field describes to the LLM all the tools it has access to and how to call them. This feature is also referred to as “function calling”. The format for describing each tool is highly dependent on the service, but typically requires you to give your tool a name, a description, and an object detailing the set of parameters your tool expects to receive. For more information on setting up tool calling, see our full tutorial. This is an optional field and defaults to an empty array.

Examples:
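One illustrative tool definition, assuming Anthropic’s tool format (a name, a description, and a JSON Schema input_schema; OpenAI nests similar fields under a function key):

{
  "name": "tools",
  "value": [
    {
      "name": "get_weather",
      "description": "Get the current weather for a given location.",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          }
        },
        "required": ["location"]
      }
    }
  ]
}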

Configuration options

Each LLM service has its own set of configuration options for parameters like temperature, max_tokens, top_p, and more. Visit the LLM service’s page for a full list of available configuration options.