DailyVoiceClient
The client-side component that speaks to the Daily Bot
Overview
The DailyVoiceClient is the component you will primarily use to interact with your Daily Bot. This client is part of the RTVI suite of client libraries, all of which follow the same API design detailed below. This component:
- Provides a start() method that handshakes and authenticates with your bot service.
- Configures your bot services.
- Manages your media connections.
- Provides methods, callbacks and events for interfacing with your bot.
The DailyVoiceClient is an extension of the RTVI VoiceClient, using Daily’s transport for voice and video communication with the bot. A Daily transport is required for Daily Bots. This page provides details on the most common parameters, methods, and callbacks you will use with the DailyVoiceClient, along with specific API expectations required for Daily Bots. For full reference material and installation instructions, visit the docs for your corresponding library.
Source Docs
API Reference
Constructor Parameters
Handshake URL to your hosted endpoint that triggers authentication, transport session creation and bot instantiation.
The VoiceClient will send a JSON POST request to this URL and pass the local configuration (config) as a body param.
The DailyVoiceClient expects this endpoint to return the response from the Daily Bots endpoint. The client then uses that response to establish the connection automatically.
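The endpoint side of this handshake can be sketched roughly as follows. This is a minimal illustration, not the Daily Bots API: the names buildBotStartRequest, HandshakeBody and ForwardedRequest are hypothetical, the Daily Bots endpoint URL is passed in rather than assumed, and bearer-token authentication is an assumption to confirm against the Daily Bots docs.

```typescript
// Hypothetical shape of the JSON body the client POSTs to the handshake URL.
// The docs only state that `config` is passed as a body param; other keys are assumed.
interface HandshakeBody {
  config?: unknown[];
  [key: string]: unknown;
}

// Plain description of the request your endpoint would forward.
interface ForwardedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) the forwarded request.
// `botsEndpoint` and `apiKey` come from your server configuration;
// the Authorization scheme shown here is an assumption.
function buildBotStartRequest(
  clientBody: HandshakeBody,
  botsEndpoint: string,
  apiKey: string,
): ForwardedRequest {
  return {
    url: botsEndpoint,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(clientBody),
  };
}
```

Your endpoint would send this request and return the response body to the client unchanged, since the client expects the Daily Bots response itself.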
A key-value object of service registrations, each representing an available service (such as OpenAI, ElevenLabs, or a local LLM) that is made available in your bot file.
While the keys can technically be anything, the Daily Bots endpoint expects an "llm" and a "tts" key, each with a value matching a currently supported service.
Example:
{
"llm": "anthropic",
"tts": "cartesia"
}
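A small pre-flight check for the required keys might look like the sketch below. The Services type and the missingServices helper are hypothetical names introduced for illustration; the check only covers the presence of "llm" and "tts", because the list of supported values is not given here.

```typescript
// Service registration as described above: keys name a role, values name a provider.
type Services = Record<string, string>;

// Returns the required keys ("llm", "tts") that are missing or empty.
function missingServices(services: Services): string[] {
  const required = ["llm", "tts"];
  return required.filter((key) => !services[key]);
}

// Example registration taken from the docs above.
const exampleServices: Services = { llm: "anthropic", tts: "cartesia" };
const problems = missingServices(exampleServices); // empty array: both keys are present
```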
Pipeline configuration object for your registered services. Must contain a valid VoiceClientConfig array.
Client config is passed to the bot at startup, and can be overridden in your server-side endpoint, where sensitive information such as API keys can be provided.
See configuration
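One way a server-side endpoint could apply overrides to the client config is sketched below. The entry shape used here (a service name plus a list of name/value options) is an assumption about VoiceClientConfig, and applyServerOverrides is a hypothetical helper; check the configuration docs for the actual structure before relying on it.

```typescript
// Assumed shape of one config entry; verify against the configuration docs.
type ConfigOption = { name: string; value: unknown };
type ServiceConfig = { service: string; options: ConfigOption[] };

// Returns a new config in which server-side options replace client options
// with the same service and name. Inputs are not mutated.
function applyServerOverrides(
  clientConfig: ServiceConfig[],
  overrides: ServiceConfig[],
): ServiceConfig[] {
  const merged = clientConfig.map((entry) => ({
    service: entry.service,
    options: [...entry.options],
  }));
  for (const override of overrides) {
    let target = merged.find((entry) => entry.service === override.service);
    if (!target) {
      // Service only configured on the server: add it as a new entry.
      target = { service: override.service, options: [] };
      merged.push(target);
    }
    for (const option of override.options) {
      const index = target.options.findIndex((o) => o.name === option.name);
      if (index >= 0) {
        target.options[index] = option;
      } else {
        target.options.push(option);
      }
    }
  }
  return merged;
}
```

Keeping the merge on the server means values such as API keys never need to appear in client code.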
An array of callback functions. See: callbacks
Enable user’s local microphone device.
Enable user’s local webcam device.
Custom HTTP headers to include in the initial start web request to the baseUrl.
Custom request properties to pass through to the base URL as part of the start method.
Methods
initDevices()
Initializes media device selection and allows users to test and switch media devices before starting the conversation.
start()
Sets up and starts the conversation.
disconnect()
Stops the conversation and tears down all network connections.
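The order in which these three methods are typically called can be sketched as below. ConversationClient is a local stand-in interface, not the library's own type, and the assumption that each method returns a Promise should be checked against the reference docs for your library. runConversation and duringCall are hypothetical names.

```typescript
// Minimal local stand-in for the methods described above (assumed to be async).
interface ConversationClient {
  initDevices(): Promise<void>;
  start(): Promise<void>;
  disconnect(): Promise<void>;
}

// Initializes devices, starts the conversation, runs your in-call logic,
// and always disconnects once the conversation has started.
async function runConversation(
  client: ConversationClient,
  duringCall: () => Promise<void>,
): Promise<void> {
  await client.initDevices();
  await client.start();
  try {
    await duringCall();
  } finally {
    // Tear down network connections even if the in-call logic throws.
    await client.disconnect();
  }
}
```

Wrapping the in-call logic in try/finally is a design choice that avoids leaving a session open when application code fails mid-conversation.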
enableMic(enable: boolean)
Enables or disables the user’s microphone, based on the value of enable.
enableCam(enable: boolean)
Enables or disables the user’s camera, based on the value of enable.
tracks()
Returns all available MediaStreamTracks for the user and bot.
// Return type
{
local: {
audio?: MediaStreamTrack;
video?: MediaStreamTrack;
},
bot?: {
audio?: MediaStreamTrack;
video?: MediaStreamTrack;
}
}
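Because every track in this return type is optional, callers need to handle missing entries. The sketch below collects whichever bot tracks are present; Track is a local stand-in for the DOM MediaStreamTrack so the example compiles without browser typings, and botTracks is a hypothetical helper.

```typescript
// Stand-in for MediaStreamTrack so the sketch does not depend on DOM typings.
type Track = { kind: string };

// Mirrors the return type shown above.
interface Tracks {
  local: { audio?: Track; video?: Track };
  bot?: { audio?: Track; video?: Track };
}

// Returns the bot's tracks that are currently available, audio first.
function botTracks(tracks: Tracks): Track[] {
  const bot = tracks.bot;
  if (!bot) {
    return []; // The bot has not joined or has no media yet.
  }
  return [bot.audio, bot.video].filter(
    (track): track is Track => track !== undefined,
  );
}
```

In a browser, the resulting array of real MediaStreamTrack objects can be passed to the MediaStream constructor and assigned to a media element's srcObject for playback.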
Callbacks
botReady: Bot is connected and ready to receive messages
transcript: STT transcript (both local and remote) flagged with partial, final or sentence
config: Bot configuration
error: Bot initialization error
errorResponse: Error response from the bot in response to an action
configAvailable: Configuration options available on the bot
configUpdated: Configuration options have changed successfully
configError: Configuration options failed to change
actionsAvailable: Actions available on the bot
metrics: RTVI reporting metrics
userTranscription: Local user speech to text
ttsText: Bot speech to text
userStartedSpeaking: User started speaking
userStoppedSpeaking: User stopped speaking
botStartedSpeaking: Bot started speaking
botStoppedSpeaking: Bot stopped speaking
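As an illustration of how the four speaking callbacks can be combined, the sketch below tracks who is currently speaking. It is self-contained and makes assumptions: the handlers take no required arguments, and how they are registered with the client (property names, signatures) is left to the reference docs for your library. createSpeakingTracker is a hypothetical helper.

```typescript
// The four speaking-related callback names listed above.
type SpeakingEvent =
  | "userStartedSpeaking"
  | "userStoppedSpeaking"
  | "botStartedSpeaking"
  | "botStoppedSpeaking";

// Builds handlers that keep a simple "who is speaking" state up to date.
function createSpeakingTracker() {
  const state = { user: false, bot: false };
  const handlers: Record<SpeakingEvent, () => void> = {
    userStartedSpeaking: () => { state.user = true; },
    userStoppedSpeaking: () => { state.user = false; },
    botStartedSpeaking: () => { state.bot = true; },
    botStoppedSpeaking: () => { state.bot = false; },
  };
  return { state, handlers };
}
```

A UI could read the state object to show a speaking indicator, or to detect when the user interrupts the bot (both flags true at once).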