Overview

NvidiaLLMService provides access to NVIDIA’s NIM language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management, with special handling for NVIDIA’s incremental token reporting and enterprise deployment.

Installation

To use NVIDIA NIM services, install the required dependencies:
pip install "pipecat-ai[nvidia]"

Prerequisites

NVIDIA NIM Setup

Before using NVIDIA NIM LLM services, you need:
  1. NVIDIA Developer Account: Sign up at NVIDIA Developer Portal
  2. API Key: Generate an NVIDIA API key for NIM services
  3. Model Selection: Choose from available NIM-hosted models
  4. Enterprise Setup: Configure NIM for on-premises deployment if needed

Required Environment Variables

  • NVIDIA_API_KEY: Your NVIDIA API key for authentication

Configuration

  • api_key (str, required): NVIDIA API key for authentication.
  • base_url (str, default: "https://integrate.api.nvidia.com/v1"): Base URL for the NIM API endpoint.
  • model (str, default: "nvidia/llama-3.1-nemotron-70b-instruct"): Model identifier to use.

InputParams

This service uses the same input parameters as OpenAILLMService. See OpenAI LLM for details.

Usage

Basic Setup

import os
from pipecat.services.nvidia import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    model="nvidia/llama-3.1-nemotron-70b-instruct",
)

With Custom Parameters

import os

from pipecat.services.nvidia import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    model="nvidia/llama-3.1-nemotron-70b-instruct",
    params=NvidiaLLMService.InputParams(
        temperature=0.7,
        top_p=0.9,
        max_completion_tokens=1024,
    ),
)

Notes

  • NVIDIA NIM uses incremental token reporting. The service accumulates token usage metrics during processing and reports the final totals at the end of each request.
  • The legacy NimLLMService import from pipecat.services.nim is deprecated. Use NvidiaLLMService from pipecat.services.nvidia instead.
  • NIM supports both cloud-hosted and on-premises deployments. For on-premises, override base_url to point to your local NIM endpoint, as sketched below.
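
A minimal sketch of the on-premises override; the localhost URL is a placeholder for your own NIM endpoint:

import os

from pipecat.services.nvidia import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),  # may not be required by a self-hosted NIM
    base_url="http://localhost:8000/v1",  # placeholder: your local NIM endpoint
    model="nvidia/llama-3.1-nemotron-70b-instruct",
)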