Overview
NvidiaLLMService provides access to NVIDIA’s NIM language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management, with special handling for NVIDIA’s incremental token reporting and enterprise deployment.
NVIDIA NIM LLM API Reference
Pipecat’s API methods for NVIDIA NIM integration
Example Implementation
Complete example with function calling
NVIDIA NIM Documentation
Official NVIDIA NIM documentation and setup
NVIDIA Developer Portal
Access NIM services and manage API keys
Installation
To use NVIDIA NIM services, install the required dependencies:
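The snippet below assumes the NVIDIA services ship in an `nvidia` extra of the `pipecat-ai` package; check your Pipecat version if the extra name differs.

```bash
pip install "pipecat-ai[nvidia]"
```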
Prerequisites

NVIDIA NIM Setup
Before using NVIDIA NIM LLM services, you need:

- NVIDIA Developer Account: Sign up at the NVIDIA Developer Portal
- API Key: Generate an NVIDIA API key for NIM services
- Model Selection: Choose from available NIM-hosted models
- Enterprise Setup: Configure NIM for on-premises deployment if needed
Required Environment Variables
NVIDIA_API_KEY: Your NVIDIA API key for authentication
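For example, set it in your shell before running your bot (placeholder value shown):

```bash
export NVIDIA_API_KEY=your_api_key_here
```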
Configuration
- api_key: NVIDIA API key for authentication.
- base_url: Base URL for the NIM API endpoint.
- model: Model identifier to use.
InputParams
This service uses the same input parameters as OpenAILLMService. See OpenAI LLM for details.
Usage
Basic Setup
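A minimal sketch of constructing the service. The import path follows the pipecat.services.nvidia package named in the Notes below (the exact module layout may vary by Pipecat version), and the model identifier is one example of a NIM-hosted model.

```python
import os

from pipecat.services.nvidia.llm import NvidiaLLMService

# Create the service; the API key is read from the NVIDIA_API_KEY
# environment variable described above.
llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    model="meta/llama-3.1-8b-instruct",
)
```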
With Custom Parameters
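A sketch passing tuning options through InputParams. Since the service shares OpenAILLMService's input parameters (see above), the fields shown here (temperature, max_tokens) are assumed from that interface, and the base_url shown is NVIDIA's hosted endpoint.

```python
import os

from pipecat.services.nvidia.llm import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    # Cloud-hosted NIM endpoint; override this for on-premises
    # deployments (see Notes below).
    base_url="https://integrate.api.nvidia.com/v1",
    model="meta/llama-3.1-8b-instruct",
    params=NvidiaLLMService.InputParams(
        temperature=0.7,
        max_tokens=1024,
    ),
)
```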
Notes
- NVIDIA NIM uses incremental token reporting. The service accumulates token usage metrics during processing and reports the final totals at the end of each request.
- The legacy NimLLMService import from pipecat.services.nim is deprecated. Use NvidiaLLMService from pipecat.services.nvidia instead.
- NIM supports both cloud-hosted and on-premises deployments. For on-premises deployments, override the base_url to point to your local NIM endpoint, as sketched below.
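A minimal sketch of the on-premises override; the localhost URL and port are hypothetical and depend on how your NIM container is deployed.

```python
import os

from pipecat.services.nvidia.llm import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    # Hypothetical local NIM endpoint; substitute your deployment's URL.
    base_url="http://localhost:8000/v1",
    model="meta/llama-3.1-8b-instruct",
)
```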