Qwen3-235B-A22B-Instruct-2507
About Qwen3-235B-A22B-Instruct-2507
Qwen3-235B-A22B-Instruct-2507 is a flagship Mixture-of-Experts (MoE) large language model from the Qwen3 series, developed by Alibaba Cloud's Qwen team. The model has 235 billion total parameters, of which 22 billion are activated per forward pass. It is an updated release of Qwen3-235B-A22B in non-thinking mode, with significant enhancements in general capabilities such as instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage. It also delivers substantial gains in long-tail knowledge coverage across multiple languages and aligns markedly better with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation. Notably, it natively supports a 256K (262,144-token) context window, which strengthens its long-context understanding. This version supports only the non-thinking mode and does not generate `<think>` blocks, aiming to deliver more efficient and precise responses for tasks such as direct Q&A and knowledge retrieval.
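As a minimal sketch of querying the model, the request below assumes an OpenAI-compatible chat completions API; the endpoint URL and the `Qwen/...` model identifier are assumptions, not taken from this page. The payload is only constructed and printed here; sending it requires a valid API key.

```python
import json

# Assumed endpoint for an OpenAI-compatible chat completions API.
API_URL = "https://api.siliconflow.cn/v1/chat/completions"

payload = {
    "model": "Qwen/Qwen3-235B-A22B-Instruct-2507",  # assumed model identifier
    "messages": [
        {"role": "user",
         "content": "Summarize the idea behind Mixture-of-Experts models."}
    ],
    "max_tokens": 512,
}

# To send the request (requires a valid API key):
#   import requests
#   r = requests.post(API_URL, json=payload,
#                     headers={"Authorization": "Bearer <YOUR_KEY>"})
print(json.dumps(payload, indent=2))
```

Because the model runs in non-thinking mode only, the response contains the answer directly, with no `<think>` block to strip.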
Available Serverless
Run queries immediately, pay only for usage.
$0.09 / $0.6 per 1M tokens (input/output)
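The per-token rates above can be turned into a simple cost estimate. This sketch applies the listed serverless rates ($0.09 per 1M input tokens, $0.6 per 1M output tokens) to a single request:

```python
# Listed serverless rates, converted to USD per token.
INPUT_RATE = 0.09 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.6 / 1_000_000   # USD per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 10,000-token prompt with a 2,000-token reply.
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.4f}")  # → $0.0021
```

Note that output tokens cost roughly 6.7x more than input tokens, so capping `max_tokens` is the main lever for controlling spend.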
Metadata

Specification
- State: Available

Architecture
- Calibrated: Yes
- Mixture of Experts: Yes
- Total Parameters: 235B
- Activated Parameters: 22B
- Reasoning: No
- Precision: FP8
- Context Length: 262K
- Max Tokens: 262K
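Since the context length and max tokens both equal 262K (262,144 tokens), a rough budget check before sending a long prompt can be sketched as follows; the token counts here are illustrative assumptions, not produced by a real tokenizer:

```python
# 256K context window as stated on this page: 262,144 tokens.
CONTEXT_LIMIT = 262_144

def fits_in_context(prompt_tokens: int, max_new_tokens: int) -> bool:
    """True if the prompt plus the requested output fits the context window."""
    return prompt_tokens + max_new_tokens <= CONTEXT_LIMIT

print(fits_in_context(250_000, 10_000))  # → True
print(fits_in_context(260_000, 10_000))  # → False
```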
Supported Functionality
- Serverless: Supported
- Serverless LoRA: Not supported
- Fine-tuning: Not supported
- Embeddings: Not supported
- Rerankers: Not supported
- Image input: Not supported
- JSON Mode: Supported
- Structured Outputs: Not supported
- Tools: Supported
- FIM Completion: Not supported
- Chat Prefix Completion: Supported
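The two supported structured-interaction features, JSON Mode and Tools, can be sketched with OpenAI-compatible request fields. The field names (`response_format`, `tools`) follow that schema and are an assumption, as is the `get_weather` tool, which is purely hypothetical:

```python
# JSON Mode: ask the model to return a syntactically valid JSON object.
json_mode_payload = {
    "model": "Qwen/Qwen3-235B-A22B-Instruct-2507",  # assumed identifier
    "messages": [{"role": "user",
                  "content": "List three MoE models as a JSON object."}],
    "response_format": {"type": "json_object"},
}

# Tools: declare a function the model may call (hypothetical example).
tools_payload = {
    "model": "Qwen/Qwen3-235B-A22B-Instruct-2507",
    "messages": [{"role": "user",
                  "content": "What is the weather in Hangzhou?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool name
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(sorted(tools_payload.keys()))  # → ['messages', 'model', 'tools']
```

Structured Outputs (schema-constrained generation) is listed as not supported, so JSON Mode guarantees only well-formed JSON, not conformance to a specific schema.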
SiliconFlow Service
Comprehensive solutions to deploy and scale your AI applications with maximum flexibility:
- 60% lower latency
- 2x higher throughput
- 65% cost savings
Compare with Other Models
See how this model stacks up against others. All of the following are Qwen chat models unless noted otherwise.

| Model | Released | Total Context | Max Output | Input / M Tokens | Output / M Tokens |
|---|---|---|---|---|---|
| Qwen3-VL-32B-Instruct | Oct 21, 2025 | 262K | 262K | $0.2 | $0.6 |
| Qwen3-VL-32B-Thinking | Oct 21, 2025 | 262K | 262K | $0.2 | $1.5 |
| Qwen3-VL-8B-Instruct | Oct 15, 2025 | 262K | 262K | $0.18 | $0.68 |
| Qwen3-VL-8B-Thinking | Oct 15, 2025 | 262K | 262K | $0.18 | $2.0 |
| Qwen3-VL-235B-A22B-Instruct | Oct 4, 2025 | 262K | 262K | $0.3 | $1.5 |
| Qwen3-VL-235B-A22B-Thinking | Oct 4, 2025 | 262K | 262K | $0.45 | $3.5 |
| Qwen3-VL-30B-A3B-Instruct | Oct 5, 2025 | 262K | 262K | $0.29 | $1.0 |
| Qwen3-VL-30B-A3B-Thinking | Oct 11, 2025 | 262K | 262K | $0.29 | $1.0 |

Wan2.2-I2V-A14B (Qwen, image-to-video): released Aug 13, 2025, priced at $0.29 per video.
Model FAQs: Usage, Deployment
Learn how to use and deploy this model with ease.
