Create a chat completion for given messages with streaming support using Alibaba’s Qwen 2.5 72B model, optimized for reasoning, coding, and multilingual understanding
JWT token for authentication - Use your API token as the Bearer token
Model UUID for the request (Qwen 2.5 72B model identifier)
Array of messages in the conversation
Model identifier for Qwen 2.5 72B
qwen-2.5-72b Whether to stream the response
Sampling temperature (0.0 to 2.0). Lower values for more focused outputs, higher for more creative
0 <= x <= 2Maximum number of tokens to generate
1 <= x <= 32768Nucleus sampling parameter. Use lower values for more focused outputs
0 <= x <= 1Top-k sampling parameter for controlling vocabulary selection
1 <= x <= 200Sequences where the API will stop generating further tokens
4Penalty for frequent tokens to reduce repetition
-2 <= x <= 2Penalty for new tokens to encourage topic diversity
-2 <= x <= 2Penalty for repeating tokens (Qwen-specific parameter)
0.1 <= x <= 2Whether to use sampling for generation
Random seed for reproducible outputs
Successful chat completion
Response for non-streaming chat completion
Unique identifier for the completion
"chatcmpl-qwen-abc123"
Object type
chat.completion "chat.completion"
Unix timestamp of when the completion was created
1699014493
The model used for completion
"qwen-2.5-72b"
System fingerprint for the model version
"qwen-2.5-72b-v1.0"