Real-time speech-to-speech service implementation using AWS Nova Sonic
The AWSNovaSonicLLMService enables natural, real-time conversations with AWS Nova Sonic, with built-in audio transcription, voice activity detection, and context management for building interactive AI experiences. It provides:
- Real-time Interaction: Stream audio in real time with low-latency response times.
- Speech Processing: Built-in speech-to-text and text-to-speech capabilities with multiple voice options.
- Voice Activity Detection: Automatic detection of speech start/stop for natural conversations.
- Context Management: Intelligent handling of conversation history and system instructions.
Specify the AWS region for the service (e.g., "us-east-1"). Note that the service may not be available in all AWS regions; check the support table in the AWS Bedrock User Guide.
High-level instructions that guide the model's behavior. Note that these instructions are more commonly included as part of the context provided to kick off the conversation.
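As a sketch of the more common approach, a system instruction can be supplied as the first message of the conversation context rather than as a constructor argument. The OpenAI-style message format shown here is an assumption based on how such contexts are typically seeded; the instruction text is invented for illustration.

```python
# Hypothetical sketch: providing the system instruction via the context
# that kicks off the conversation, using an OpenAI-style messages list.
system_instruction = (
    "You are a friendly assistant. Keep your responses short, "
    "since they will be spoken aloud."
)

messages = [{"role": "system", "content": system_instruction}]

# This messages list would then seed the LLM context object that is
# passed into the pipeline when the conversation starts.
```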
List of function definitions for tool/function calling. Note that tools, too, are more commonly included as part of the context provided to kick off the conversation.
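For illustration, a function definition might look like the following. The JSON-Schema-style shape is the convention commonly used for LLM tool calling; the function name and parameters here are invented, not part of the service's API.

```python
# Hypothetical sketch: one function definition in the JSON-Schema style
# commonly used for tool calling. Name and fields are for illustration only.
get_weather_tool = {
    "name": "get_current_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name, e.g. 'Seattle'",
            },
        },
        "required": ["city"],
    },
}

tools = [get_weather_tool]
```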
The Params object configures the behavior of the AWS Nova Sonic model.
It is strongly recommended to stick with default values (most easily by
omitting params when constructing AWSNovaSonicLLMService) unless you have
a good understanding of the parameters and their impact. Deviating from the
defaults may lead to unexpected behavior.
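A construction sketch following that recommendation might look like this. The import path and constructor parameter names are assumptions about the Pipecat-style API this page documents; verify them against your installed version.

```python
import os

def create_llm_service():
    # Hypothetical sketch: import path and parameter names are assumed
    # from the API described here; check your installed version.
    from pipecat.services.aws_nova_sonic import AWSNovaSonicLLMService

    return AWSNovaSonicLLMService(
        access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
        secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
        region="us-east-1",  # must be a region where Nova Sonic is available
        # params omitted: the defaults are strongly recommended
    )
```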
This service supports function calling (also known as tool calling) which allows the LLM to request information from external services and APIs. For example, you can enable your bot to:
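As a sketch, a function-call handler is just an async function that fetches the requested information and returns a result for the model to speak. The handler name, return shape, and the registration call mentioned in the comment (something like `llm.register_function(...)`) are assumptions for illustration, not confirmed API.

```python
import asyncio

# Hypothetical sketch of a tool-call handler. In a Pipecat-style service
# it would typically be registered on the LLM service, e.g. something like
# llm.register_function("get_current_weather", fetch_weather); names and
# shapes here are assumptions for illustration.
async def fetch_weather(city: str) -> dict:
    # A real handler would call an external weather API here.
    return {"city": city, "conditions": "sunny", "temperature_f": 72}

result = asyncio.run(fetch_weather("Seattle"))
```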