Get Started
Use Cases
Explore the different types of applications you can build with Pipecat, from voice assistants to multimodal AI agents
Voice Assistants
Pipecat makes it easy to build voice-based AI agents that can:
- Listen to user speech and convert it to text
- Maintain conversation context across multiple exchanges
- Generate appropriate responses using LLMs
- Convert responses back to natural-sounding speech
- Handle all of this in real-time for natural conversations
Rather than dealing with the complexity of coordinating multiple AI services and managing real-time audio, Pipecat handles the orchestration for you. You can focus on defining your agent’s behavior and let Pipecat manage the technical details of real-time processing and service integration.
Multimodal Applications
Pipecat excels at handling multiple data types simultaneously:
- Audio streams for voice interaction
- Video frames for visual processing
- Text for LLM interaction
- Generated images for visual responses
Real-time AI Processing
Built to handle streaming AI workloads:
- Continuous speech recognition
- Real-time LLM interactions
- Dynamic audio/video generation
- Interactive media processing