Overview

QwenLLMService provides access to Alibaba Cloud’s Qwen language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management, with particularly strong capabilities for Chinese language processing.

Installation

To use QwenLLMService, install the required dependencies:

pip install "pipecat-ai[qwen]"

You’ll also need to set up your DashScope API key as an environment variable: QWEN_API_KEY.

Get your API key from Alibaba Cloud Model Studio.

Frames

Input

  • OpenAILLMContextFrame - Conversation context and history
  • LLMMessagesFrame - Direct message list
  • VisionImageRawFrame - Images for vision processing
  • LLMUpdateSettingsFrame - Runtime parameter updates

Output

  • LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
  • LLMTextFrame - Streamed completion chunks
  • FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
  • ErrorFrame - API or processing errors

Function Calling

Function Calling Guide

Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide

Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os
from pipecat.services.qwen.llm import QwenLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure Qwen service
llm = QwenLLMService(
    api_key=os.getenv("QWEN_API_KEY"),
    model="qwen2.5-72b-instruct",  # High-quality open source model
    params=QwenLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and country, e.g. Beijing, China or San Francisco, USA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create bilingual context for Chinese/English support
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant in voice conversations.
            Keep responses concise for speech output. You can respond in Chinese when
            the user speaks Chinese, or English when they speak English.

            你是一个语音对话助手。请保持简洁的回答以适合语音输出。
            当用户用中文交流时用中文回答,用英文交流时用英文回答。"""
        }
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler
async def fetch_weather(params):
    location = params.arguments["location"]
    # Return response that works well in both languages
    await params.result_callback({
        "conditions": "sunny",
        "temperature": "22°C",
        "location": location
    })

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,  # Consider QwenTTSService for Chinese speech
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Inherits all OpenAI metrics capabilities:

  • Time to First Byte (TTFB) - Response latency measurement
  • Processing Duration - Total request processing time
  • Token Usage - Prompt tokens, completion tokens, and totals

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)

Additional Notes

  • OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
  • Long Context Support: Models support up to 1M token contexts for extensive conversations
  • Multilingual Excellence: Superior performance in Chinese with strong English capabilities
  • Code-Switching: Seamlessly handles mixed Chinese-English conversations
  • Alibaba Cloud Integration: Native integration with Alibaba Cloud ecosystem