What is TTFS?
Time To Final Segment (TTFS) measures how long it takes from the moment a user stops speaking until the STT service delivers the final transcript. This latency directly affects how long your bot waits before it starts responding.
Why TTFS matters
TTFS feeds directly into turn stop strategies, which decide when the user has finished speaking and the bot should respond.
- Value too low: The turn stop strategy gives up waiting before the final transcript arrives. The bot responds based on incomplete text, or misses the user’s input entirely.
- Value too high: The bot waits longer than necessary after the user stops speaking, creating awkward pauses in the conversation.
- Value just right: The bot waits long enough for the transcript to arrive, then responds immediately.
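The three cases above can be illustrated with a toy model. This is only a sketch: `classify_wait`, its arguments, and the `tolerance` threshold are hypothetical names chosen for illustration, not part of Pipecat, and a real turn stop strategy likely weighs more signals than a single timeout.

```python
def classify_wait(configured_ttfs: float, actual_latency: float,
                  tolerance: float = 0.1) -> str:
    """Compare the configured TTFS wait with the latency actually observed.

    Both values are in seconds, measured from the moment the user stops
    speaking. Hypothetical helper for illustration only.
    """
    if configured_ttfs < actual_latency:
        # The strategy stops waiting before the final transcript arrives.
        return "too low"
    if configured_ttfs - actual_latency > tolerance:
        # The transcript arrived long before the wait expired: dead air.
        return "too high"
    return "just right"
```

For example, with a transcript that takes 0.5 s to arrive, a configured wait of 0.3 s falls in the "too low" case and a wait of 1.5 s falls in the "too high" case.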
Default P99 latency values
Pipecat includes measured P99 TTFS values for every supported STT service. These are used automatically when you create a service, with no configuration required. You can see default values in the source code.
Local services (NVIDIA, Whisper) default to 1.0s since actual latency depends entirely on your hardware. Always measure and override for local deployments.
Measuring latency for your deployment
The default values are measured under standard conditions, but your actual latency depends on:
- Network distance to the STT provider
- Region where the service is hosted
- Service configuration (model size, language, features enabled)
- Audio quality and encoding settings
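Once you have collected TTFS samples from your own deployment, one way to reduce them to a single P99 figure is the nearest-rank percentile. The sketch below assumes you already have a list of per-utterance latencies in seconds; the function name `p99` is illustrative, and this is not necessarily how Pipecat's built-in defaults were computed.

```python
def p99(samples: list[float]) -> float:
    """Return the nearest-rank 99th percentile of the latency samples."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    n = len(ordered)
    # Integer form of ceil(0.99 * n), which avoids floating-point rounding.
    rank = (99 * n + 99) // 100
    return ordered[rank - 1]  # rank is 1-based
```

With only a handful of samples the result is simply the slowest measurement, so collect enough utterances for the tail to be meaningful before using the result as an override.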
Overriding the default value
Pass the ttfs_p99_latency parameter to any STT service constructor to override the built-in default. The value is broadcast in an STTMetadataFrame at startup, so turn stop strategies automatically adjust their timing.
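The override pattern can be sketched as follows. `SketchSTTService` and `BUILT_IN_TTFS_P99` are hypothetical stand-ins, not Pipecat classes or constants; only the `ttfs_p99_latency` keyword comes from the text above, and 0.45 is an arbitrary example value. With a real Pipecat STT service, the same keyword would go in that service's constructor alongside its usual arguments.

```python
from typing import Optional

# Stand-in for a built-in default; 1.0 s is the value quoted above for
# local services.
BUILT_IN_TTFS_P99 = 1.0


class SketchSTTService:
    """Hypothetical stand-in for an STT service (not a Pipecat class)."""

    def __init__(self, ttfs_p99_latency: Optional[float] = None):
        # Use the built-in default only when the caller passes nothing.
        if ttfs_p99_latency is None:
            ttfs_p99_latency = BUILT_IN_TTFS_P99
        self.ttfs_p99_latency = ttfs_p99_latency


default_stt = SketchSTTService()                      # keeps the default
tuned_stt = SketchSTTService(ttfs_p99_latency=0.45)   # measured override
```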