Step 1: Local Development
Prerequisites
Environment
- Python 3.10 or later
- uv package manager installed
Deepgram (STT)
Create an account and generate your API key for real-time
speech recognition.
OpenAI (LLM)
Create an account and generate an API key for intelligent conversation
responses.
Cartesia (TTS)
Sign up and generate your API key for natural voice
synthesis.
Setup
- Clone the quickstart repository
- Configure your API keys: open the .env file in your text editor and add your API keys (see the sketch after this list)
- Set up virtual environment and install dependencies
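A minimal sketch of these setup steps, assuming the standard pipecat-quickstart layout; the repository URL, the env.example filename, and the exact key names are assumptions to check against your clone:

```bash
# Clone the quickstart repository (URL assumed)
git clone https://github.com/pipecat-ai/pipecat-quickstart.git
cd pipecat-quickstart

# Copy the example environment file, then add your keys in a text editor
# (expected keys: DEEPGRAM_API_KEY, OPENAI_API_KEY, CARTESIA_API_KEY)
cp env.example .env

# Create the virtual environment and install dependencies with uv
uv sync
```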
Run your bot locally
Now you’re ready to run your bot! Start it with the command sketched below.
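Assuming the quickstart’s bot.py entry point, a typical invocation looks like this:

```bash
uv run bot.py
```

Then open the URL printed in the console (typically http://localhost:7860) and click Connect.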
First run note: The initial startup may take ~20 seconds as Pipecat downloads required models and imports. Subsequent runs will be much faster.
Step 2: Deploy to Production
Transform your local bot into a production-ready service. Pipecat Cloud handles scaling, monitoring, and global deployment.
Prerequisites
- Sign up for Pipecat Cloud
- Set up Docker
- Install Docker on your system
- Create a Docker Hub account
- Log in to Docker Hub (see the sketch after this list)
- Pipecat Cloud CLI: the pipecatcloud CLI is already installed with your quickstart project. We’ll use it as pcc below to manage deployments and secrets.
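A sketch of the two authentication steps; the pcc auth login subcommand is an assumption based on the Pipecat Cloud CLI, and running it via uv keeps you inside the project’s virtual environment:

```bash
# Authenticate Docker so you can push images
docker login

# Authenticate the Pipecat Cloud CLI installed with the quickstart project
uv run pcc auth login
```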
Configure your deployment
The pcc-deploy.toml file tells Pipecat Cloud how to run your bot. Update the image field with your Docker Hub username:
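An illustrative pcc-deploy.toml using the fields described below; the exact layout of your generated file may differ (in particular, nesting min_agents under a [scaling] table is an assumption):

```toml
agent_name = "quickstart"
image = "YOUR_DOCKERHUB_USERNAME/quickstart:0.1"
secret_set = "quickstart-secrets"

[scaling]
min_agents = 1
```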
- agent_name: Your bot’s name in Pipecat Cloud
- image: The Docker image to deploy (format: username/image:version)
- secret_set: Where your API keys are stored securely
- min_agents: Number of bot instances to keep ready (1 = instant start)
Set up image_credentials in your TOML file for authenticated image pulls.
Configure secrets
Upload your API keys to Pipecat Cloud’s secure storage. The command sketched below creates a secret set named quickstart-secrets (matching your TOML file) and uploads all your API keys from .env.
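A sketch of the upload command, assuming pcc secrets set accepts a --file flag pointing at your .env:

```bash
uv run pcc secrets set quickstart-secrets --file .env
```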
Build and deploy
- Build and push your Docker image, making sure the tag matches the image information in your pcc-deploy.toml file (see the sketch after this list)
- Deploy to Pipecat Cloud
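A sketch of both steps; the linux/arm64 platform flag reflects Pipecat Cloud’s ARM-based runtime and, like the example tag, is an assumption to verify against the current docs:

```bash
# Build for Pipecat Cloud (ARM64) and push to Docker Hub
docker build --platform=linux/arm64 -t YOUR_DOCKERHUB_USERNAME/quickstart:0.1 .
docker push YOUR_DOCKERHUB_USERNAME/quickstart:0.1

# Deploy using the settings in pcc-deploy.toml
uv run pcc deploy
```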
Connect to your agent
- Open your Pipecat Cloud dashboard
- Select your quickstart agent → Sandbox
- Allow microphone access and click Connect
Ready to scale?
Explore advanced Pipecat Cloud features like scaling, monitoring, secrets
management, and production best practices.
Understanding the Quickstart Bot
When you speak to your bot, here’s the real-time pipeline that processes your conversation:
- Audio Capture: Your browser captures microphone audio and sends it via WebRTC
- Voice Activity Detection: Silero VAD detects when you start and stop speaking
- Speech Recognition: Deepgram converts your speech to text in real-time
- Language Processing: OpenAI’s GPT model generates an intelligent response
- Speech Synthesis: Cartesia converts the response text back to natural speech
- Audio Playback: The generated audio streams back to your browser
AI Services
Your bot uses three AI services, each configured with API keys from your .env file:
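A sketch of how the three services are typically constructed in Pipecat, assuming current import paths and environment-variable key names; the model and voice values are placeholders rather than the quickstart’s exact settings:

```python
import os

from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService

# Speech-to-text: streams microphone audio to Deepgram, emits transcriptions
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

# LLM: generates responses from the running conversation context
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

# Text-to-speech: converts the LLM's text back into audio
tts = CartesiaTTSService(
    api_key=os.getenv("CARTESIA_API_KEY"),
    voice_id="YOUR_CARTESIA_VOICE_ID",  # placeholder voice ID
)
```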
Context and Messages
Your bot maintains conversation history using a context object, enabling multi-turn interactions where the bot remembers what was said earlier. The context is initialized with a system message that defines the bot’s personality:
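A sketch of the context setup, assuming Pipecat’s OpenAILLMContext and the aggregator pair created from the LLM service above; the system prompt wording is illustrative:

```python
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext

# The system message defines the bot's personality and speaking style
messages = [
    {
        "role": "system",
        "content": "You are a friendly AI assistant. Keep responses short "
        "and conversational, since they will be read aloud.",
    },
]

# The context object accumulates the conversation history across turns
context = OpenAILLMContext(messages)

# The aggregator pair adds user turns to the context and records
# the assistant's responses after each LLM completion
context_aggregator = llm.create_context_aggregator(context)
```

RTVI Protocol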
When building web or mobile clients, you can use Pipecat’s client SDKs that communicate with your bot via the RTVI (Real-Time Voice Interaction) protocol. In our quickstart example, we initialize the RTVI processor to handle client-server messaging and events:
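A sketch of the RTVI processor initialization, assuming the pipecat.processors.frameworks.rtvi module:

```python
from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIProcessor

# RTVI relays client-server messages and events (bot ready, transcripts, ...)
# to Pipecat client SDKs over the transport
rtvi = RTVIProcessor(config=RTVIConfig(config=[]))
```

Pipeline Configuration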
The core of your bot is a Pipeline that processes data through a series of processors:
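A sketch of the pipeline wiring, mirroring the processing order described earlier; the processor names come from the sketches above and are assumptions about the quickstart’s exact code:

```python
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Frames flow top to bottom: audio in -> STT -> LLM -> TTS -> audio out
pipeline = Pipeline(
    [
        transport.input(),               # audio from the client over WebRTC
        rtvi,                            # client messaging and events
        stt,                             # speech to text
        context_aggregator.user(),       # add the user's turn to the context
        llm,                             # generate a response
        tts,                             # text to speech
        transport.output(),              # audio back to the client
        context_aggregator.assistant(),  # record the assistant's response
    ]
)

# The task wraps the pipeline with runtime parameters
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
```

Event Handlers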
Event handlers manage the bot’s lifecycle and user interactions:
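A sketch of typical handlers, assuming the event names used across Pipecat’s transport and RTVI examples:

```python
@rtvi.event_handler("on_client_ready")
async def on_client_ready(rtvi):
    # Tell the client SDK that the bot is ready to exchange messages
    await rtvi.set_bot_ready()


@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
    # Greet the user by queueing the current context for the LLM
    messages.append({"role": "system", "content": "Say hello and briefly introduce yourself."})
    await task.queue_frames([context_aggregator.user().get_context_frame()])


@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
    # Shut the pipeline down when the user leaves
    await task.cancel()
```

Running the Pipeline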
Finally, the pipeline is executed by a PipelineRunner:
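A sketch of the runner, assuming PipelineRunner from pipecat.pipeline.runner:

```python
from pipecat.pipeline.runner import PipelineRunner

runner = PipelineRunner(handle_sigint=False)
await runner.run(task)
```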
We pass handle_sigint=False because the main runner handles system signals.
Bot Entry Point
The quickstart uses Pipecat’s runner system:
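A sketch of the entry point, assuming Pipecat’s runner module (pipecat.runner); the exact helper names vary between releases, and run_bot is a hypothetical wrapper around the pipeline code above:

```python
from pipecat.runner.types import RunnerArguments


async def bot(runner_args: RunnerArguments):
    # Called by the Pipecat runner, locally or on Pipecat Cloud, with the
    # information needed to create a transport and run the pipeline
    await run_bot(runner_args)  # hypothetical helper wrapping the code above


if __name__ == "__main__":
    from pipecat.runner.run import main

    main()  # development runner for local testing
```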
Production ready: This bot pattern is fully compatible with Pipecat Cloud, meaning you can deploy your bot without any code changes.
Troubleshooting
- Browser permissions: Make sure to allow microphone access when prompted by your browser.
- Connection issues: If the WebRTC connection fails, first try a different browser. If that fails, make sure you don’t have a VPN or firewall rules blocking traffic. WebRTC uses UDP to communicate.
- Audio issues: Check that your microphone and speakers are working and not muted.