Basic deployment pattern for Pipecat bots
- `bot.py`: your Pipecat bot / agent, containing all the pipelines that you want to run in order to communicate with an end-user. A bot file may take some command line arguments, such as a transport URL and configuration.
- `bot_runner.py`: typically a basic HTTP service that listens for incoming user requests and spawns the relevant bot file in response.

You can name these files whatever you like; we'll refer to them as `bot.py` and `bot_runner.py` for simplicity.

The basic user / bot flow looks like this:

1. User requests to join a session via the client / app
2. Bot runner handles the request
3. Bot runner spawns the bot / agent
4. Bot instantiates and joins the session via the specified transport credentials
5. Bot runner returns status to the client
Most bot runners expose a `/start_bot/` endpoint which listens for incoming user POST requests or webhooks, then configures the session (such as creating rooms on your transport provider) and instantiates a new bot process. A client will typically require some information regarding the newly spawned bot, such as a web address to connect to, so we also return some JSON with the necessary details.
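Here's a minimal sketch of such an endpoint. FastAPI is an assumption (any HTTP framework works), and the `create_room_and_token` and `spawn_bot` helpers are hypothetical placeholders, sketched further below:

```python
# bot_runner.py (sketch): a minimal /start_bot/ endpoint using FastAPI.
from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.post("/start_bot/")
async def start_bot():
    try:
        # Configure the session on the transport provider
        # (create_room_and_token is a hypothetical helper; see the Daily sketch below)
        room_url, token = await create_room_and_token()
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Unable to start bot: {e}")

    # Spawn a bot process for this session (spawn_bot is sketched below)
    spawn_bot(room_url, token)

    # Return the details the client needs to join the same session
    return {"room_url": room_url}

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=7860)
```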
Your `bot.py` pipeline needs a transport through which to communicate with the end-user. This may be a service that you want to host and include in your deployment, or it may be a third-party service waiting for peers to connect (such as Daily, or a websocket).
For this example, we will make use of Daily's WebRTC transport. This means that our `bot_runner.py` will need to do some configuration when it spawns a new bot: create a Daily room for the session, and issue tokens so that both the bot and the user can join it.
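As a sketch, that setup might look like the following. The `/v1/rooms` and `/v1/meeting-tokens` endpoints are Daily's documented REST API; the helper name and the use of `aiohttp` are our own choices:

```python
# daily_helpers.py (sketch): create a Daily room and a meeting token.
# Requires DAILY_API_KEY in the environment.
import os

import aiohttp

DAILY_API_URL = "https://api.daily.co/v1"

async def create_room_and_token() -> tuple[str, str]:
    headers = {"Authorization": f"Bearer {os.environ['DAILY_API_KEY']}"}
    async with aiohttp.ClientSession(headers=headers) as session:
        # Create a new room for this session
        async with session.post(f"{DAILY_API_URL}/rooms", json={}) as resp:
            room = await resp.json()

        # Issue a token scoped to that room; in practice you'd issue one
        # for the bot and one for the user
        payload = {"properties": {"room_name": room["name"]}}
        async with session.post(
            f"{DAILY_API_URL}/meeting-tokens", json=payload
        ) as resp:
            token = (await resp.json())["token"]

    return room["url"], token
```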
Remember: `bot.py` is an encapsulated entity and has no knowledge of `bot_runner.py`. You should provide the bot with everything it needs to operate during instantiation. Sticking to this approach helps keep things simple and makes it easier to step through debugging: if the bot launches and something goes wrong, you know to look for errors in your bot file.
Here's a basic `bot.py` that connects to a WebRTC session, passes audio transcription to GPT-4, and returns the response as speech using ElevenLabs text-to-speech. We'll also use Silero voice activity detection (VAD) to better know when the user has stopped talking.
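A condensed sketch of that pipeline is below. Import paths and service constructors have shifted between Pipecat releases, so treat this as an outline rather than a drop-in file:

```python
# bot.py (sketch): Daily WebRTC in, GPT-4 in the middle, ElevenLabs out.
# Import paths follow a recent pipecat-ai release and may differ in yours.
import argparse
import asyncio
import os

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport

async def main(room_url: str, token: str):
    # Daily provides the audio transport, transcription, and VAD hooks
    transport = DailyTransport(
        room_url,
        token,
        "Pipecat bot",
        DailyParams(
            audio_out_enabled=True,
            transcription_enabled=True,
            vad_enabled=True,
            vad_analyzer=SileroVADAnalyzer(),
        ),
    )

    llm = OpenAILLMService(api_key=os.environ["OPENAI_API_KEY"], model="gpt-4")
    tts = ElevenLabsTTSService(
        api_key=os.environ["ELEVENLABS_API_KEY"],
        voice_id=os.environ["ELEVENLABS_VOICE_ID"],
    )

    # Keep a running conversation context for the LLM
    context = OpenAILLMContext(
        [{"role": "system", "content": "You are a helpful voice assistant."}]
    )
    aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline([
        transport.input(),       # user audio / transcription frames in
        aggregator.user(),       # add user speech to the LLM context
        llm,                     # context -> completion
        tts,                     # completion text -> audio
        transport.output(),      # audio frames back to the session
        aggregator.assistant(),  # record the bot's response in the context
    ])

    await PipelineRunner().run(PipelineTask(pipeline))

if __name__ == "__main__":
    # The room URL and token are passed in by the bot runner (see below);
    # the -u / -t flag names are just the convention used in this guide
    parser = argparse.ArgumentParser()
    parser.add_argument("-u", type=str, help="Daily room URL")
    parser.add_argument("-t", type=str, help="Daily meeting token")
    args = parser.parse_args()
    asyncio.run(main(args.u, args.t))
```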
And a `bot_runner.py` that:

- Listens for incoming POST requests from the client
- Creates a Daily room and issues tokens for the session (as sketched above)
- Spawns a `bot.py` process, passing it the room URL and token
We'll also need a `requirements.txt` declaring our dependencies:
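For example (the `pipecat-ai` extras below are assumptions; check the release you're installing for the exact extra names):

```text
pipecat-ai[daily,openai,elevenlabs,silero]
fastapi
uvicorn
aiohttp
python-dotenv
```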
And a `.env` file with our service keys:
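For example (variable names are assumptions; use whatever names your code reads):

```text
DAILY_API_KEY=...
OPENAI_API_KEY=...
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
```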
In this example, the runner spawns `bot.py` as a subprocess. When spawning the process, we pass through the transport room and token as system arguments to our bot, so it knows where to connect.
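A sketch of that spawn step (the `-u` / `-t` flags match the `bot.py` sketch above and are otherwise arbitrary):

```python
# bot_runner.py (sketch, continued): spawn bot.py for one session.
import subprocess
import sys

def spawn_bot(room_url: str, token: str) -> subprocess.Popen:
    # sys.executable runs the bot under the same Python interpreter
    return subprocess.Popen(
        [sys.executable, "bot.py", "-u", room_url, "-t", token]
    )
```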
Subprocesses are a great way to test out your bot in the cloud without too much hassle, but depending on the size of the host machine, this approach will likely not hold up well under load. Whilst some bots are just simple operators between the transport and third-party AI services (such as OpenAI), others have somewhat CPU-intensive operations, such as loading and running VAD models, so you may find you're only able to scale this to support 5-10 concurrent bots. Scaling beyond that requires virtualizing your bot with its own set of system resources, a process which depends on your cloud provider.
Our project now contains the following files:

- `bot.py`
- `bot_runner.py`
- `requirements.txt`
- `.env`
- `Dockerfile`
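A minimal `Dockerfile` might look like this (the Python base image and port are assumptions; match them to your runner):

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# The HTTP port bot_runner.py listens on (7860 in the sketches above)
EXPOSE 7860

CMD ["python", "bot_runner.py"]
```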
With everything in place, you can `docker build ...` and deploy your container. Of course, you can still work with your bot in local development too:
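For example, in a virtual environment (commands assume a Unix-like shell):

```shell
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python bot_runner.py
```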