NVIDIA Riva

Overview

NVIDIA Riva provides two STT service implementations. NvidiaSTTService performs real-time streaming transcription using Parakeet models. NvidiaSegmentedSTTService transcribes complete audio segments using Canary models, which offer advanced language support and higher accuracy.

NVIDIA Riva STT API Reference

Pipecat’s API methods for NVIDIA Riva STT integration

Example Implementation

Complete example with NVIDIA services integration

NVIDIA Riva Documentation

Official NVIDIA Riva ASR documentation

NVIDIA Developer Portal

Access API keys and Riva services

Installation

To use NVIDIA Riva services, install the required dependency:

pip install "pipecat-ai[nvidia]"

Prerequisites

NVIDIA Riva Setup

Before using NVIDIA Riva STT services, you need:

NVIDIA Developer Account: Sign up at NVIDIA Developer Portal
API Key: Generate an NVIDIA API key for Riva services
Model Selection: Choose between Parakeet (streaming) and Canary (segmented) models

Required Environment Variables

NVIDIA_API_KEY: Your NVIDIA API key for authentication
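
As a minimal sketch of how the key might be read at startup, the helper below fails early with a clear message when NVIDIA_API_KEY is missing. The function name require_api_key is hypothetical and not part of Pipecat; it only wraps the standard library.

```python
import os


def require_api_key(env=None, name="NVIDIA_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the pipeline")
    return key
```

Calling require_api_key() before constructing a service surfaces a missing key immediately rather than at the first gRPC request.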

Configuration

NvidiaSTTService

Real-time streaming transcription using NVIDIA Riva’s Parakeet models. Supports interim results and continuous audio processing.

Parameter	Type	Default	Description
`api_key`	`str`	required	NVIDIA API key for authentication.
`server`	`str`	`"grpc.nvcf.nvidia.com:443"`	NVIDIA Riva server address.
`model_function_map`	`Mapping[str, str]`	-	Mapping containing `function_id` and `model_name` for the ASR model.
`sample_rate`	`int`	`None`	Audio sample rate in Hz. When `None`, uses the pipeline’s configured sample rate.
`params`	`InputParams`	`None`	Configuration parameters. See NvidiaSTTService InputParams below.
`use_ssl`	`bool`	`True`	Whether to use SSL for the gRPC connection.

NvidiaSTTService InputParams

Parameter	Type	Default	Description
`language`	`Language`	`Language.EN_US`	Target language for transcription.
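
The constructor parameters and InputParams above might be combined as in the following sketch. The helper names streaming_stt_kwargs and make_streaming_stt are hypothetical. Accessing InputParams as NvidiaSTTService.InputParams is an assumption that mirrors how the segmented service is used elsewhere on this page.

```python
def streaming_stt_kwargs(api_key, server="grpc.nvcf.nvidia.com:443",
                         use_ssl=True, sample_rate=None):
    """Collect constructor arguments for NvidiaSTTService as a plain dict.

    Hypothetical helper; the defaults repeat the documented ones.
    """
    if not api_key:
        raise ValueError("api_key is required")
    return {
        "api_key": api_key,
        "server": server,
        "use_ssl": use_ssl,
        "sample_rate": sample_rate,  # None defers to the pipeline's sample rate
    }


def make_streaming_stt(api_key, **overrides):
    """Build the streaming service; requires pipecat-ai[nvidia] to be installed."""
    # Imported lazily so streaming_stt_kwargs stays usable without Pipecat.
    from pipecat.services.nvidia import NvidiaSTTService
    from pipecat.transcriptions.language import Language

    return NvidiaSTTService(
        # Assumption: InputParams is exposed as a nested class, as it is
        # for NvidiaSegmentedSTTService.
        params=NvidiaSTTService.InputParams(language=Language.EN_US),
        **streaming_stt_kwargs(api_key, **overrides),
    )
```

Keeping the argument dict separate from the constructor call makes the configuration easy to log or unit-test without opening a gRPC connection.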

NvidiaSegmentedSTTService

Batch/segmented transcription using NVIDIA Riva’s Canary models. Processes complete audio segments after VAD detects speech boundaries.

Parameter	Type	Default	Description
`api_key`	`str`	required	NVIDIA API key for authentication.
`server`	`str`	`"grpc.nvcf.nvidia.com:443"`	NVIDIA Riva server address.
`model_function_map`	`Mapping[str, str]`	-	Mapping containing `function_id` and `model_name` for the ASR model.
`sample_rate`	`int`	`None`	Audio sample rate in Hz. When `None`, uses the pipeline’s configured sample rate.
`params`	`InputParams`	`None`	Configuration parameters. See NvidiaSegmentedSTTService InputParams below.
`use_ssl`	`bool`	`True`	Whether to use SSL for the gRPC connection.

NvidiaSegmentedSTTService InputParams

Parameter	Type	Default	Description
`language`	`Language`	`Language.EN_US`	Target language for transcription.
`profanity_filter`	`bool`	`False`	Whether to filter profanity from results.
`automatic_punctuation`	`bool`	`True`	Whether to add automatic punctuation.
`verbatim_transcripts`	`bool`	`False`	Whether to return verbatim transcripts.
`boosted_lm_words`	`list[str]`	`None`	List of words to boost in the language model.
`boosted_lm_score`	`float`	`4.0`	Score boost for specified words.

Usage

Streaming with Parakeet

import os

from pipecat.services.nvidia import NvidiaSTTService

# Read the API key from the environment rather than hard-coding it.
stt = NvidiaSTTService(
    api_key=os.getenv("NVIDIA_API_KEY"),
)

Segmented with Canary

import os

from pipecat.services.nvidia import NvidiaSegmentedSTTService
from pipecat.transcriptions.language import Language

stt = NvidiaSegmentedSTTService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    params=NvidiaSegmentedSTTService.InputParams(
        language=Language.ES,
        automatic_punctuation=True,
        # Boost domain-specific terms above the default score of 4.0.
        boosted_lm_words=["Pipecat", "NVIDIA"],
        boosted_lm_score=6.0,
    ),
)
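
When the boosted vocabulary comes from a larger word list, it may help to normalize it before passing it to InputParams. The sketch below uses a hypothetical helper, boost_params, that removes blanks and duplicates and returns keyword arguments matching the documented boosted_lm_words and boosted_lm_score fields.

```python
def boost_params(words, score=4.0):
    """Return word-boosting keyword arguments for InputParams.

    Hypothetical helper: strips whitespace, drops empty entries and
    duplicates (keeping first-seen order), and falls back to None when
    nothing is left, which matches the documented default.
    """
    seen = set()
    unique = []
    for word in words:
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            unique.append(word)
    return {
        "boosted_lm_words": unique or None,
        "boosted_lm_score": float(score),
    }


# Usage, with pipecat-ai[nvidia] installed:
# params = NvidiaSegmentedSTTService.InputParams(
#     language=Language.ES,
#     **boost_params(["Pipecat", "NVIDIA", "Pipecat"], score=6.0),
# )
```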

Notes

Model cannot be changed after initialization: Use the model_function_map parameter in the constructor to specify the model and function ID.
Streaming vs segmented: NvidiaSTTService provides real-time interim and final results through continuous streaming. NvidiaSegmentedSTTService processes complete audio segments for higher accuracy.
Language support: Arabic, English (US/GB), French, German, Hindi, Italian, Japanese, Korean, Portuguese (BR), Russian, and Spanish (ES/US) are supported.
Word boosting: Use boosted_lm_words and boosted_lm_score in the segmented service to improve recognition of domain-specific terms.
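
Because the model is fixed at construction time, as the first note says, the model_function_map needs to be correct before the service is created. The following sketch checks that both documented keys are present. The helper name check_model_function_map is hypothetical, and the values shown are placeholders rather than real function IDs or model names.

```python
REQUIRED_KEYS = ("function_id", "model_name")


def check_model_function_map(mapping):
    """Return a copy of the mapping after verifying both keys are set.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    missing = [key for key in REQUIRED_KEYS if not mapping.get(key)]
    if missing:
        raise ValueError(f"model_function_map is missing: {', '.join(missing)}")
    return dict(mapping)


# Placeholder values: substitute the function ID and model name
# for the model you want to use.
model_function_map = check_model_function_map({
    "function_id": "<your-function-id>",
    "model_name": "<your-model-name>",
})

# Usage, with pipecat-ai[nvidia] installed:
# stt = NvidiaSTTService(
#     api_key=os.getenv("NVIDIA_API_KEY"),
#     model_function_map=model_function_map,
# )
```

To switch models later, construct a new service instance with a different mapping rather than mutating the existing one.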
