Qwen3-Omni-30B-A3B-Instruct
About Qwen3-Omni-30B-A3B-Instruct
Qwen3-Omni-30B-A3B-Instruct is a member of the latest Qwen3 series from Alibaba's Qwen team. It is a Mixture of Experts (MoE) model with 30 billion total parameters and 3 billion active parameters, which reduces inference cost while maintaining strong performance. The model was trained on high-quality, multi-source, multilingual data and performs well on core capabilities such as multilingual dialogue, as well as on code and math tasks.
Explore how Qwen3-Omni-30B-A3B-Instruct's advanced multimodal and multilingual capabilities can solve complex, real-world problems.
Multimodal Content Creation
Generate and refine diverse content (text, images, audio, video), ensuring coherence and brand consistency across all formats.
Use Case Example:
"From a text brief, the model generated a marketing video script, selected relevant stock images, and synthesized a natural-sounding voiceover in three languages, significantly reducing production time."
Real-time Multilingual Support
Deliver instant, natural-sounding customer assistance across multiple languages and modalities, including voice, chat, and video analysis.
Use Case Example:
"A customer speaking French showed a faulty device via video call; the AI instantly understood the issue, provided verbal troubleshooting steps in French, and displayed relevant diagrams."
Advanced Media Analysis
Extract deep, actionable insights from vast audio and video archives, identifying objects, transcribing speech, and detecting complex events.
Use Case Example:
"Automatically indexed hours of security footage, identifying specific vehicle models, transcribing conversations in noisy environments, and flagging unusual sound patterns like glass breaking."
Interactive Learning & Training
Create dynamic, personalized learning experiences with multimodal feedback, problem-solving, and adaptive content delivery.
Use Case Example:
"An engineering student uploaded a handwritten circuit diagram; the AI verbally explained the design flaws, guided them through corrections, and provided real-time feedback on their revised drawing."
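As an illustration of how the multimodal use cases above might be driven programmatically, the sketch below builds a request payload mixing text with an image. It follows the common OpenAI-style "content parts" message convention; the endpoint schema, the `image_url`/`audio_url` part types, and the example URL are assumptions for illustration, not this provider's documented API.

```python
# Hypothetical sketch: assembling a multimodal chat request payload for a
# model like Qwen3-Omni-30B-A3B-Instruct. The message schema follows the
# widely used OpenAI-style "content parts" convention; any real provider's
# API may differ in field names and accepted media types.

def build_multimodal_message(text, image_url=None, audio_url=None):
    """Assemble one user message mixing text with optional image/audio parts."""
    parts = [{"type": "text", "text": text}]
    if image_url:
        parts.append({"type": "image_url", "image_url": {"url": image_url}})
    if audio_url:
        parts.append({"type": "audio_url", "audio_url": {"url": audio_url}})
    return {"role": "user", "content": parts}

# Example: the circuit-diagram tutoring scenario from the use case above.
request_body = {
    "model": "Qwen3-Omni-30B-A3B-Instruct",
    "messages": [
        build_multimodal_message(
            "Explain the design flaws in this circuit diagram.",
            image_url="https://example.com/circuit.png",  # placeholder URL
        )
    ],
}
```

The helper keeps text and media parts in one message so the model sees them as a single turn, mirroring how the video-call and diagram scenarios above combine modalities in one exchange.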
Metadata
State: Deprecated
Architecture: Multimodal MoE
Calibrated: No
Mixture of Experts: Yes
Total Parameters: 30B
Activated Parameters: 3B
Reasoning: No
Precision: FP8
Context Length: 66K
Max Tokens: 66K
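The context length and max-token figures above imply a simple budget: prompt tokens plus requested output tokens must fit inside the context window. A minimal sketch of that check follows; note that whether "66K" means exactly 66,000 or 65,536 tokens is provider-specific, so the limit is a parameter rather than a stated fact.

```python
# Sketch: checking that a request fits the advertised 66K context window.
# The exact token count behind "66K" is an assumption here (providers may
# define it as 66,000 or 65,536), so the limit is passed in as a parameter.

def fits_context(prompt_tokens, max_new_tokens, context_limit=66_000):
    """Return True if the prompt plus requested generation fit in the window."""
    return prompt_tokens + max_new_tokens <= context_limit

print(fits_context(60_000, 4_000))   # within budget
print(fits_context(60_000, 10_000))  # exceeds the window
```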
Compare with Other Models
See how this model stacks up against others.
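Prices in this comparison are quoted per million tokens, so the cost of a single request is straightforward arithmetic. A small sketch, using the Qwen3-VL-32B-Instruct rates from the listing as the example inputs:

```python
# Sketch: estimating per-request cost from per-million-token prices, as
# quoted in the model comparison (e.g. $0.2/M input, $0.6/M output for
# Qwen3-VL-32B-Instruct).

def estimate_cost(input_tokens, output_tokens,
                  input_price_per_m, output_price_per_m):
    """Cost in dollars for one request, given per-million-token prices."""
    return ((input_tokens / 1e6) * input_price_per_m
            + (output_tokens / 1e6) * output_price_per_m)

# 100K input + 10K output tokens at Qwen3-VL-32B-Instruct rates:
print(round(estimate_cost(100_000, 10_000, 0.2, 0.6), 4))  # 0.026
```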

Qwen3-VL-32B-Instruct (Qwen, chat)
Released: Oct 21, 2025
Total Context: 262K | Max Output: 262K
Input: $0.2 / M Tokens | Output: $0.6 / M Tokens

Qwen3-VL-32B-Thinking (Qwen, chat)
Released: Oct 21, 2025
Total Context: 262K | Max Output: 262K
Input: $0.2 / M Tokens | Output: $1.5 / M Tokens

Qwen3-VL-8B-Instruct (Qwen, chat)
Released: Oct 15, 2025
Total Context: 262K | Max Output: 262K
Input: $0.18 / M Tokens | Output: $0.68 / M Tokens

Qwen3-VL-8B-Thinking (Qwen, chat)
Released: Oct 15, 2025
Total Context: 262K | Max Output: 262K
Input: $0.18 / M Tokens | Output: $2 / M Tokens

Qwen3-VL-235B-A22B-Instruct (Qwen, chat)
Released: Oct 4, 2025
Total Context: 262K | Max Output: 262K
Input: $0.3 / M Tokens | Output: $1.5 / M Tokens

Qwen3-VL-235B-A22B-Thinking (Qwen, chat)
Released: Oct 4, 2025
Total Context: 262K | Max Output: 262K
Input: $0.45 / M Tokens | Output: $3.5 / M Tokens

Qwen3-VL-30B-A3B-Instruct (Qwen, chat)
Released: Oct 5, 2025
Total Context: 262K | Max Output: 262K
Input: $0.29 / M Tokens | Output: $1 / M Tokens

Qwen3-VL-30B-A3B-Thinking (Qwen, chat)
Released: Oct 11, 2025
Total Context: 262K | Max Output: 262K
Input: $0.29 / M Tokens | Output: $1 / M Tokens

Wan2.2-I2V-A14B (Qwen, image-to-video)
Released: Aug 13, 2025
Price: $0.29 / Video