Bot Profiles
A list of currently supported bot profiles and how to configure them
Bot Profile Overview
When making your /bots/start request, you must include a bot_profile. The bot_profile corresponds to a pre-defined Pipecat pipeline of services. Daily Bots provides a number of bot profiles to choose from. Below is a list of the currently supported bot profiles, the ID to use for each, the services each one requires configuration for, and a description of what it does.
Bot Type | Configuration String | Description | Configurable Services |
---|---|---|---|
Voice AI (Standard) | "voice_2024_10" "voice_2024_08" | Sets up a voice-only conversational bot that listens and responds to the user. | STT, LLM, TTS |
Voice AI (Natural) | "natural_conversation_2024_11" | Sets up a voice-only conversational bot that listens and responds naturally to the user. | STT, LLM, TTS |
Vision And Voice | "vision_2024_10" "vision_2024_08" | Sets up a bot that listens to and can see the user so it can respond to the user and be informed of their surroundings. | STT, LLM, TTS |
Gemini Multimodal Live | "gemini_multimodal_live_2024_12" | Sets up a speech-to-speech, multimodal bot using Google Gemini’s Multimodal Live API. | LLM |
OpenAI Realtime | "openai_realtime_beta_2024_10" | Sets up a speech-to-speech bot using OpenAI’s Realtime API beta. | LLM |
Twilio Websocket | "twilio_ws_voice_2024_09" | Sets up a voice-only conversational bot that communicates with the user over Twilio’s WebSocket API. | STT, LLM, TTS |
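The table can also be expressed as a lookup for checking a request before it is sent. The following is a minimal sketch: the profile IDs and service lists are transcribed from the table, while the lowercase service keys (stt, llm, tts) and the helper name missing_services are assumptions made for illustration and are not part of the Daily Bots API.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Services each profile needs configured, transcribed from the table above.
# The lowercase keys are an assumed naming convention for this sketch.
_FULL = frozenset({"stt", "llm", "tts"})
_LLM_ONLY = frozenset({"llm"})

PROFILE_SERVICES: Dict[str, FrozenSet[str]] = {
    "voice_2024_10": _FULL,
    "voice_2024_08": _FULL,
    "natural_conversation_2024_11": _FULL,
    "vision_2024_10": _FULL,
    "vision_2024_08": _FULL,
    "gemini_multimodal_live_2024_12": _LLM_ONLY,
    "openai_realtime_beta_2024_10": _LLM_ONLY,
    "twilio_ws_voice_2024_09": _FULL,
}


def missing_services(bot_profile: str, configured: Iterable[str]) -> Set[str]:
    """Return the services the profile needs that are not yet configured."""
    if bot_profile not in PROFILE_SERVICES:
        raise ValueError(f"unknown bot_profile: {bot_profile!r}")
    return set(PROFILE_SERVICES[bot_profile]) - set(configured)
```

Calling missing_services("voice_2024_10", ["llm"]) reports that stt and tts still need configuration, while the speech-to-speech profiles are satisfied by an llm entry alone.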
Voice AI
The Voice AI bot profiles enable conversational AI interactions. The bot listens to the user, processes the input, and responds accordingly. These profiles come in two variants:
- Standard Voice AI: Basic conversational interactions
- Natural Voice AI: Enhanced natural dialogue with improved turn-taking and context awareness
Both bot profiles require configurations for Speech-to-Text (STT), Language Model (LLM), and Text-to-Speech (TTS) services.
Example Request
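A minimal sketch of a request body for a Voice AI profile is shown below. Only bot_profile is confirmed by this page; the max_duration and services field names, the shape of the services mapping, and the placeholder provider names are assumptions for illustration and should be checked against the Daily Bots API reference. The sketch only builds and prints the JSON body and does not send it.

```python
import json


def build_start_request(bot_profile, services, max_duration=300):
    """Assemble a JSON-serializable body for a /bots/start request."""
    if not bot_profile:
        raise ValueError("bot_profile is required")
    return {
        "bot_profile": bot_profile,    # required, per the overview above
        "max_duration": max_duration,  # assumed field name (seconds)
        "services": dict(services),    # assumed field name and shape
    }


# Placeholder provider names; substitute the providers you actually use.
body = build_start_request(
    "voice_2024_10",
    {
        "stt": "<stt-provider>",
        "llm": "<llm-provider>",
        "tts": "<tts-provider>",
    },
)
print(json.dumps(body, indent=2))
```

Swapping the first argument for "natural_conversation_2024_11" selects the Natural variant; both variants take the same three service entries.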
Vision And Voice
The Vision and Voice bot profile is a more advanced bot that can see the user as well as listen and respond to them. It requires a camera attached to the client, and the client must be able to send video frames to the bot. The bot then uses those video frames to inform its responses. This bot profile requires the same services as the Voice AI bot profiles, with the addition of a camera service. To enable the camera, set enableCam to true in your RTVIClient and specify one of the Vision and Voice bot profiles.
Example Request
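A minimal sketch of a request body for a Vision and Voice profile follows. The two profile IDs come from the table above; the services field name, its shape, and the placeholder provider names are assumptions for illustration. The camera itself is enabled on the client through enableCam, so it does not appear in this body.

```python
import json

# Profile IDs listed for Vision And Voice in the table above.
VISION_PROFILES = ("vision_2024_10", "vision_2024_08")


def build_vision_request(services, bot_profile=VISION_PROFILES[0]):
    """Build a /bots/start body restricted to the Vision and Voice profiles."""
    if bot_profile not in VISION_PROFILES:
        raise ValueError(f"{bot_profile!r} is not a Vision and Voice profile")
    return {
        "bot_profile": bot_profile,
        "services": dict(services),  # assumed field name and shape
    }


# Placeholder provider names; substitute the providers you actually use.
vision_body = build_vision_request(
    {"stt": "<stt-provider>", "llm": "<llm-provider>", "tts": "<tts-provider>"}
)
print(json.dumps(vision_body, indent=2))
```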
Voice AI over Twilio WebSocket
Using the "twilio_ws_voice_2024_09" bot profile, you can create a bot that communicates with the user over Twilio’s WebSocket API. This bot profile requires the same services as the Voice AI profiles, with the addition of a Twilio WebSocket service. To learn how to enable the Twilio WebSocket service, check out the tutorial.