Qwen
LLM service implementation using Alibaba Cloud’s Qwen models through an OpenAI-compatible interface
Overview
`QwenLLMService` provides access to Alibaba Cloud’s Qwen language models through an OpenAI-compatible interface. It inherits from `OpenAILLMService` and supports streaming responses, function calling, and context management, with particularly strong capabilities for Chinese language processing.
Installation
To use `QwenLLMService`, install the required dependencies. You’ll also need to make your Qwen API key available in the `QWEN_API_KEY` environment variable.
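Assuming the service ships as an optional extra of the `pipecat-ai` package (the extra name here is an assumption; check your framework's install docs), setup typically looks like:

```shell
# Install with the Qwen extra (extra name assumed)
pip install "pipecat-ai[qwen]"

# Expose your DashScope key to the service
export QWEN_API_KEY=your_api_key
```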
Configuration
Constructor Parameters
- `api_key`: Your DashScope API key for accessing Qwen models
- `model`: Model identifier (see Available Models section)
- `base_url`: Qwen API endpoint (OpenAI-compatible mode)
Input Parameters
Inherits OpenAI-compatible parameters:

- `frequency_penalty`: Reduces the likelihood of repeating tokens based on their frequency. Range: [-2.0, 2.0]
- `max_tokens`: Maximum number of tokens to generate. Must be greater than or equal to 1
- `presence_penalty`: Reduces the likelihood of repeating any token that has already appeared. Range: [-2.0, 2.0]
- `temperature`: Controls randomness in the output. Range: [0.0, 2.0]
- `top_p`: Controls diversity via nucleus sampling. Range: [0.0, 1.0]
Usage Example
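Because the service speaks the standard OpenAI Chat Completions protocol, the request it issues has the familiar shape. A minimal, framework-agnostic sketch of that payload (the endpoint URL and default values are assumptions to verify against Alibaba Cloud's documentation):

```python
# DashScope's OpenAI-compatible endpoint (assumed; verify in Alibaba Cloud docs)
QWEN_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"

def build_chat_request(model: str, messages: list, temperature: float = 0.7,
                       max_tokens: int = 1024) -> dict:
    """Assemble an OpenAI-style chat completion payload for a Qwen model."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": True,  # the service streams responses token by token
    }

request = build_chat_request(
    "qwen-plus",
    [
        {"role": "system",
         "content": "You are a helpful assistant. 你是一个乐于助人的助手。"},
        {"role": "user", "content": "Hello!"},
    ],
)
```

In an actual pipeline you would not build this payload by hand; the service constructs it from your conversation context and the constructor/input parameters described above.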
Methods
See the LLM base class methods for additional functionality.
Function Calling
This service supports function calling (also known as tool calling), which allows the LLM to request information from external services and APIs. For example, you can enable your bot to:
- Check current weather conditions
- Query databases
- Access external APIs
- Perform custom actions
Function Calling Guide
Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.
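To make the flow concrete, here is a framework-agnostic sketch of tool-call dispatch: an OpenAI-style function schema, a handler registry, and a simulated tool call arriving from the model. The weather function and its handler are hypothetical; the schema fields follow the standard OpenAI tools format:

```python
import json

# OpenAI-style tool schema (hypothetical weather function)
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Handler registry, similar in spirit to an LLM service's function registration
handlers = {}

def register(name):
    def wrap(fn):
        handlers[name] = fn
        return fn
    return wrap

@register("get_current_weather")
def get_current_weather(city: str) -> dict:
    # Real code would call an external weather API here
    return {"city": city, "conditions": "sunny", "temp_c": 22}

def dispatch(tool_call: dict) -> dict:
    """Route a model-issued tool call to its registered handler."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return handlers[name](**args)

# Simulate a tool call as it would arrive from the model
result = dispatch({"function": {"name": "get_current_weather",
                                "arguments": '{"city": "Hangzhou"}'}})
```

The service handles the wire format for you; your application only supplies the schemas and the handlers.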
Available Models
Commercial Models
| Model Name | Description | Context Size | Strengths |
|---|---|---|---|
| qwen-max | Best performance | 32K tokens | Complex reasoning, multi-step tasks |
| qwen-plus | Balanced performance | 131K tokens | Good balance of quality, speed, and cost |
| qwen-turbo | Fast and affordable | 1M tokens | Simple tasks, high throughput |
Open Source Models
| Model Name | Description | Context Size |
|---|---|---|
| qwen2.5-72b-instruct | Largest open source model | 131K tokens |
| qwen2.5-32b-instruct | Medium-sized model | 131K tokens |
| qwen2.5-14b-instruct | Smaller model | 131K tokens |
| qwen2.5-7b-instruct | Smallest model | 131K tokens |
| qwen2.5-14b-instruct-1m | Long context model | 1M tokens |
| qwen2.5-7b-instruct-1m | Long context model | 1M tokens |
See Alibaba Cloud’s Model Studio documentation for a complete and up-to-date list of supported models.
Chinese Language Support
Qwen models have excellent support for Chinese language processing:
- Native Chinese language capabilities in all models
- Support for both Simplified and Traditional Chinese
- Strong performance on Chinese linguistic tasks
- Culturally appropriate responses for Chinese contexts
For optimal Chinese language support:
- Use a system prompt in both English and Chinese
- Consider using specialized Chinese models when available
- Pair with `QwenTTSService` for high-quality Chinese speech synthesis
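For instance, a bilingual system prompt can be assembled from parallel English and Chinese instructions (the wording below is only illustrative):

```python
def bilingual_system_message(english: str, chinese: str) -> dict:
    """Combine English and Chinese instructions into one system message."""
    return {"role": "system", "content": f"{english}\n{chinese}"}

msg = bilingual_system_message(
    "You are a helpful voice assistant. Reply in the user's language.",
    "你是一个乐于助人的语音助手。请使用用户的语言回复。",
)
```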
Frame Flow
Inherits the OpenAI LLM service frame flow.
Metrics Support
The service collects standard LLM metrics:
- Token usage (prompt and completion)
- Processing duration
- Time to First Byte (TTFB)
- Function call metrics
Notes
- OpenAI-compatible interface
- Superior Chinese language capabilities
- Supports long contexts (up to 1M tokens in some models)
- Handles function calling
- Manages conversation context
- Includes token usage tracking
- Thread-safe processing
- Automatic error handling