# Nebius

> Large Language Model services using Nebius Token Factory's OpenAI-compatible API

## Overview

`NebiusLLMService` provides chat completion capabilities through Nebius Token Factory's OpenAI-compatible API. It supports streaming responses and function calling.

## Installation

To use Nebius LLM services, install the required dependencies:

```bash
uv add "pipecat-ai[nebius]"
```

## Prerequisites

### Nebius Account Setup

Before using Nebius LLM services, you need:

1. **Nebius Account**: Sign up at [Nebius](https://nebius.com/)
2. **API Key**: Generate an API key from the Token Factory dashboard
3. **Model Selection**: Choose from available models (default: `Qwen/Qwen3-30B-A3B-Instruct-2507`)

### Required Environment Variables

* `NEBIUS_API_KEY`: Your Nebius API key for authentication

## Configuration

* **API key** (`api_key`): The API key for accessing the Nebius API.
* **Base URL**: The base URL for the Nebius API. Override it if you are using a different endpoint.
* **Settings** (`settings`): Runtime-configurable model settings. See [Settings](#settings) below.

### Settings

Runtime-configurable settings are passed via the `settings` constructor argument using `NebiusLLMService.Settings(...)`. They can be updated mid-conversation with `LLMUpdateSettingsFrame`. See [Service Settings](/guides/fundamentals/service-settings) for details.

| Parameter           | Type    | Default                              | Description                                                                                                  |
| ------------------- | ------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------------ |
| `model`             | `str`   | `"Qwen/Qwen3-30B-A3B-Instruct-2507"` | Nebius model identifier. Check Nebius Token Factory for available models.                                    |
| `temperature`       | `float` | `NOT_GIVEN`                          | Sampling temperature (0.0 to 2.0). Lower values give more focused output; higher values give more creative output. |
| `max_tokens`        | `int`   | `NOT_GIVEN`                          | Maximum number of tokens to generate.                                                                        |
| `top_p`             | `float` | `NOT_GIVEN`                          | Top-p (nucleus) sampling (0.0 to 1.0). Controls diversity of output.                                         |
| `frequency_penalty` | `float` | `NOT_GIVEN`                          | Penalty for frequently used tokens (-2.0 to 2.0). Positive values discourage repetition.                     |
| `presence_penalty`  | `float` | `NOT_GIVEN`                          | Penalty for tokens that have already appeared (-2.0 to 2.0). Positive values encourage new topics.           |

`NOT_GIVEN` values are omitted from the API request entirely, letting the Nebius API use its own defaults. This is different from `None`, which would be sent explicitly.

## Usage

### Basic Setup

```python
import os

from pipecat.services.nebius import NebiusLLMService

llm = NebiusLLMService(
    api_key=os.getenv("NEBIUS_API_KEY"),
)
```

### With Custom Settings

```python
import os

from pipecat.services.nebius import NebiusLLMService

llm = NebiusLLMService(
    api_key=os.getenv("NEBIUS_API_KEY"),
    settings=NebiusLLMService.Settings(
        model="Qwen/Qwen3-30B-A3B-Instruct-2507",
        temperature=0.7,
        max_tokens=1000,
    ),
)
```

### Updating Settings at Runtime

Model settings can be changed mid-conversation using `LLMUpdateSettingsFrame`:

```python
from pipecat.frames.frames import LLMUpdateSettingsFrame
from pipecat.services.nebius.llm import NebiusLLMSettings

await worker.queue_frame(
    LLMUpdateSettingsFrame(
        delta=NebiusLLMSettings(
            temperature=0.3,
        )
    )
)
```

## Notes

* **OpenAI Compatibility**: Nebius's API is OpenAI-compatible, so familiar OpenAI request patterns and parameters apply.
* **Function Calling**: Supports the OpenAI-style tool/function calling format.
* **Streaming**: Supports streaming responses for real-time interaction.

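
For reference, the OpenAI-style tool format mentioned in the notes above generally looks like the sketch below. The `get_weather` tool, its description, and its `city` parameter are hypothetical and chosen only for illustration; how a definition like this is registered with Pipecat is not shown here.

```python
import json

# Hypothetical tool definition in the OpenAI-style function calling format.
# The name, description, and parameters are illustrative assumptions.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        # Parameters are described with a JSON Schema object.
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. Berlin",
                },
            },
            "required": ["city"],
        },
    },
}

# OpenAI-style chat completion requests carry tools as a list under "tools".
request_fragment = {"tools": [get_weather_tool]}
print(json.dumps(request_fragment, indent=2))
```

The model replies with the tool name and JSON-encoded arguments when it decides to call the function; the application is responsible for running the function and returning the result.
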
## Event Handlers

`NebiusLLMService` supports the following event handlers, inherited from [LLMService](/server/events/service-events):

| Event                       | Description                                                             |
| --------------------------- | ----------------------------------------------------------------------- |
| `on_completion_timeout`     | Called when an LLM completion request times out                         |
| `on_function_calls_started` | Called when function calls are received and execution is about to start |

```python
@llm.event_handler("on_completion_timeout")
async def on_completion_timeout(service):
    print("LLM completion timed out")
```
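
To try out handler logic without a live Nebius connection, one option is a small stand-in object. The sketch below is only an illustration: `StubService` and its `emit` method are hypothetical and are not part of Pipecat. Only the `event_handler` decorator name and the `on_completion_timeout` event mirror the example above.

```python
import asyncio


class StubService:
    """Hypothetical stand-in that mimics decorator-style handler registration."""

    def __init__(self):
        self._handlers = {}

    def event_handler(self, name):
        # Return a decorator that stores the coroutine function under the event name.
        def decorator(fn):
            self._handlers.setdefault(name, []).append(fn)
            return fn

        return decorator

    async def emit(self, name, *args):
        # Call every handler registered for this event, passing the service first.
        for fn in self._handlers.get(name, []):
            await fn(self, *args)


llm = StubService()
timeout_count = 0


@llm.event_handler("on_completion_timeout")
async def on_completion_timeout(service):
    # Count timeouts instead of printing, so the behavior can be asserted.
    global timeout_count
    timeout_count += 1


# Simulate two timeouts being reported by the service.
async def simulate():
    await llm.emit("on_completion_timeout")
    await llm.emit("on_completion_timeout")


asyncio.run(simulate())
```

Keeping handlers small and side-effect focused, as here, makes them easy to check in isolation before wiring them to the real service.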