Transports
Transports exchange audio and video streams between the user and bot.

| Service | Setup |
|---|---|
| DailyTransport | pip install "pipecat-ai[daily]" |
| FastAPIWebSocketTransport | pip install "pipecat-ai[websocket]" |
| HeyGenTransport | pip install "pipecat-ai[heygen]" |
| LiveKitTransport | pip install "pipecat-ai[livekit]" |
| SmallWebRTCTransport | pip install "pipecat-ai[webrtc]" |
| TavusTransport | pip install "pipecat-ai[tavus]" |
| WebSocket Transports | pip install "pipecat-ai[websocket]" |
| WhatsAppTransport | pip install "pipecat-ai[webrtc]" |
Serializers
Serializers convert between frames and media streams, enabling real-time communication over a websocket.
Speech-to-Text
Speech-to-Text services receive audio input and output transcriptions.

| Service | Setup |
|---|---|
| AssemblyAI | pip install "pipecat-ai[assemblyai]" |
| AWS Transcribe | pip install "pipecat-ai[aws]" |
| Azure | pip install "pipecat-ai[azure]" |
| Cartesia | pip install "pipecat-ai[cartesia]" |
| Deepgram | pip install "pipecat-ai[deepgram]" |
| ElevenLabs | pip install "pipecat-ai[elevenlabs]" |
| Fal Wizper | pip install "pipecat-ai[fal]" |
| Gladia | pip install "pipecat-ai[gladia]" |
| Google | pip install "pipecat-ai[google]" |
| Gradium | pip install "pipecat-ai[gradium]" |
| Groq (Whisper) | pip install "pipecat-ai[groq]" |
| Hathora | pip install "pipecat-ai[hathora]" |
| NVIDIA | pip install "pipecat-ai[nvidia]" |
| OpenAI | pip install "pipecat-ai[openai]" |
| SambaNova (Whisper) | pip install "pipecat-ai[sambanova]" |
| Sarvam | pip install "pipecat-ai[sarvam]" |
| Soniox | pip install "pipecat-ai[soniox]" |
| Speechmatics | pip install "pipecat-ai[speechmatics]" |
| Whisper | pip install "pipecat-ai[whisper]" |
Large Language Models
LLMs receive text- or audio-based input and output a streaming text response.

| Service | Setup |
|---|---|
| Anthropic | pip install "pipecat-ai[anthropic]" |
| AWS Bedrock | pip install "pipecat-ai[aws]" |
| Azure | pip install "pipecat-ai[azure]" |
| Cerebras | pip install "pipecat-ai[cerebras]" |
| DeepSeek | pip install "pipecat-ai[deepseek]" |
| Fireworks AI | pip install "pipecat-ai[fireworks]" |
| Google Gemini | pip install "pipecat-ai[google]" |
| Google Vertex AI | pip install "pipecat-ai[google]" |
| Grok | pip install "pipecat-ai[grok]" |
| Groq | pip install "pipecat-ai[groq]" |
| NVIDIA | pip install "pipecat-ai[nvidia]" |
| Ollama | pip install "pipecat-ai[ollama]" |
| OpenAI | pip install "pipecat-ai[openai]" |
| OpenPipe | pip install "pipecat-ai[openpipe]" |
| OpenRouter | pip install "pipecat-ai[openrouter]" |
| Perplexity | pip install "pipecat-ai[perplexity]" |
| Qwen | pip install "pipecat-ai[qwen]" |
| SambaNova | pip install "pipecat-ai[sambanova]" |
| Together AI | pip install "pipecat-ai[together]" |
Text-to-Speech
Text-to-Speech services receive text input and output audio streams or chunks.

| Service | Setup |
|---|---|
| Async | pip install "pipecat-ai[asyncai]" |
| AWS Polly | pip install "pipecat-ai[aws]" |
| Azure | pip install "pipecat-ai[azure]" |
| Camb AI | pip install "pipecat-ai[camb]" |
| Cartesia | pip install "pipecat-ai[cartesia]" |
| Deepgram | pip install "pipecat-ai[deepgram]" |
| ElevenLabs | pip install "pipecat-ai[elevenlabs]" |
| Fish | pip install "pipecat-ai[fish]" |
| Google | pip install "pipecat-ai[google]" |
| Gradium | pip install "pipecat-ai[gradium]" |
| Groq | pip install "pipecat-ai[groq]" |
| Hathora | pip install "pipecat-ai[hathora]" |
| Hume | pip install "pipecat-ai[hume]" |
| Inworld | No dependencies required |
| LMNT | pip install "pipecat-ai[lmnt]" |
| MiniMax | No dependencies required |
| Neuphonic | pip install "pipecat-ai[neuphonic]" |
| NVIDIA | pip install "pipecat-ai[nvidia]" |
| OpenAI | pip install "pipecat-ai[openai]" |
| Piper | No dependencies required |
| PlayHT | pip install "pipecat-ai[playht]" |
| ResembleAI | pip install "pipecat-ai[resembleai]" |
| Rime | pip install "pipecat-ai[rime]" |
| Sarvam | No dependencies required |
| Speechmatics | pip install "pipecat-ai[speechmatics]" |
| XTTS | pip install "pipecat-ai[xtts]" |
Speech-to-Speech
Speech-to-Speech services are multi-modal LLM services that take in audio, video, or text and output audio or text.

| Service | Setup |
|---|---|
| AWS Nova Sonic | pip install "pipecat-ai[aws-nova-sonic]" |
| Gemini Multimodal Live | pip install "pipecat-ai[google]" |
| Gemini Live Vertex AI | pip install "pipecat-ai[google]" |
| Grok Voice Agent | pip install "pipecat-ai[grok]" |
| OpenAI Realtime | pip install "pipecat-ai[openai]" |
| Ultravox | pip install "pipecat-ai[ultravox]" |
Image Generation
Image generation services receive text inputs and output images.
Video
Video services enable you to build an avatar where audio and video are synchronized.
Memory
Memory services can be used to store and retrieve conversations.

| Service | Setup |
|---|---|
| mem0 | pip install "pipecat-ai[mem0]" |
Vision
Vision services receive a streaming video input and output text describing the video input.

| Service | Setup |
|---|---|
| Moondream | pip install "pipecat-ai[moondream]" |
Analytics & Monitoring
Analytics services help you better understand how your service operates.

| Service | Setup |
|---|---|
| Sentry | pip install "pipecat-ai[sentry]" |
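
A bot usually needs several of the services listed above at once (a transport, a speech-to-text service, an LLM, and a text-to-speech service). pip accepts multiple extras in one comma-separated bracket list, so the individual setup commands can be merged into a single install. The sketch below shows one way to build that command; the helper name `pip_install_command` is hypothetical and not part of Pipecat, and the extras chosen are only an example selection from the tables.

```python
def pip_install_command(extras, package="pipecat-ai"):
    """Build one pip command that installs several extras together.

    Hypothetical helper for illustration only; not part of Pipecat.
    """
    # Drop blanks and duplicates while keeping first-seen order
    # (one extra, e.g. "deepgram", can cover both STT and TTS).
    unique = list(dict.fromkeys(e.strip() for e in extras if e.strip()))
    if not unique:
        return f"pip install {package}"
    return f'pip install "{package}[{",".join(unique)}]"'


# Example selection: Daily transport, Deepgram STT, OpenAI LLM, Cartesia TTS.
print(pip_install_command(["daily", "deepgram", "openai", "cartesia"]))
# prints: pip install "pipecat-ai[daily,deepgram,openai,cartesia]"
```

Services marked "No dependencies required" need no extra, so they can be left out of the list.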