
Qwen3-Coder-30B-A3B-Instruct API, Deployment, Pricing
Qwen/Qwen3-Coder-30B-A3B-Instruct
Qwen3-Coder-30B-A3B-Instruct is a code model from the Qwen3 series developed by Alibaba's Qwen team. As a streamlined and optimized model, it maintains strong performance and efficiency while focusing on enhanced coding capabilities. Among open-source models it shows significant advantages on complex tasks such as agentic coding and agentic browser use, as well as on foundational coding tasks. The model natively supports a long context of 256K tokens, extensible up to 1M tokens, enabling better repository-scale understanding and processing. Furthermore, it provides robust agentic coding support for platforms like Qwen Code and CLINE, featuring a specially designed function call format.
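As a rough sketch of how this model is typically called, the snippet below uses an OpenAI-compatible chat-completions client; the base URL, the API-key environment variable, and the sampling settings are illustrative assumptions rather than details from this page.

```python
# Minimal sketch: calling Qwen3-Coder-30B-A3B-Instruct through an
# OpenAI-compatible chat-completions endpoint. The base_url and the
# PROVIDER_API_KEY environment variable are placeholders for whatever
# provider actually hosts the model.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",   # hypothetical endpoint
    api_key=os.environ["PROVIDER_API_KEY"],  # hypothetical key variable
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-Coder-30B-A3B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
    ],
    max_tokens=512,
    temperature=0.7,
)

print(response.choices[0].message.content)
```

Since the model is tuned for agentic coding with a dedicated function call format, the same chat-completions request can usually also carry a `tools` array for function calling, in whatever schema the hosting provider exposes.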
Details
Model Provider: Qwen
Type: text
Sub Type: chat
Size: 30B
Publish Time: Aug 1, 2025
Input Price: $0.07 / M Tokens
Output Price: $0.28 / M Tokens
Context length: 262K
Tags: MoE, 30B, 262K
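To make the listed prices concrete, here is a small arithmetic sketch of per-request cost; the token counts in the example are invented for illustration.

```python
# Rough cost estimate from the per-million-token prices listed above.
INPUT_PRICE_PER_M = 0.07   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.28  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 12,000-token prompt over a large code file with a 2,000-token reply.
print(f"${request_cost(12_000, 2_000):.5f}")  # ~$0.00140
```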
Compare with Other Models
See how this model stacks up against others.

Qwen
chat
Qwen3-VL-235B-A22B-Instruct
Release on: Oct 4, 2025
Qwen3-VL-235B-A22B-Instruct is a 235B-parameter Mixture-of-Experts (MoE) vision-language model with 22B activated parameters. It is an instruction-tuned version of Qwen3-VL-235B-A22B, aligned for chat applications. Qwen3-VL is a series of multimodal models that accept both text and image inputs and are trained on large-scale multimodal data, demonstrating advanced capabilities in understanding and reasoning over text and images...
Total Context: 262K
Max output: 262K
Input: $0.3 / M Tokens
Output: $1.5 / M Tokens

inclusionAI
chat
Ring-1T
Release on: Oct 14, 2025
Ring-1T is an open-source, trillion-parameter thinking model released by the Bailing team. Built upon the Ling 2.0 architecture and the Ling-1T-base foundation model, it features 1 trillion total parameters with 50 billion activated parameters and supports a context window of up to 131K tokens. The model's deep reasoning and natural language inference capabilities have been significantly enhanced through large-scale verifiable reward reinforcement learning (RLVR), combined with the self-developed icepop reinforcement learning stabilization method and the efficient ASystem RL framework. Ring-1T achieves leading open-source performance on challenging reasoning benchmarks, including math competitions (e.g., IMO 2025), code generation (e.g., ICPC World Finals 2025), and logical reasoning...
Total Context: 131K
Max output: 131K
Input: $0.57 / M Tokens
Output: $2.28 / M Tokens
DeepSeek
chat
DeepSeek-V3.1-Terminus
Release on: Sep 29, 2025
DeepSeek-V3.1-Terminus is an updated version of the V3.1 model from DeepSeek, positioned as a hybrid, agent-oriented large language model. This update maintains the model's original capabilities while focusing on addressing user-reported issues and improving stability. It significantly enhances language consistency, reducing instances of mixed Chinese-English text and abnormal characters. The model integrates both a 'Thinking Mode' for complex, multi-step reasoning and a 'Non-thinking Mode' for direct, quick responses, switchable via the chat template. As a key enhancement, V3.1-Terminus features improved performance for its Code Agent and Search Agent, making it more reliable for tool use and executing complex, multi-step tasks...
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Output: $1.0 / M Tokens
DeepSeek
chat
DeepSeek-R1
Release on: May 28, 2025
DeepSeek-R1-0528 is a reasoning model powered by reinforcement learning (RL) that addresses the issues of repetition and readability. Prior to RL, DeepSeek-R1 incorporated cold-start data to further optimize its reasoning performance. It achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks, and through carefully designed training methods, it has enhanced overall effectiveness...
Total Context: 164K
Max output: 164K
Input: $0.5 / M Tokens
Output: $2.18 / M Tokens
DeepSeek
chat
DeepSeek-R1-Distill-Qwen-14B
Release on: Jan 20, 2025
DeepSeek-R1-Distill-Qwen-14B is a distilled model based on Qwen2.5-14B. The model was fine-tuned using 800k curated samples generated by DeepSeek-R1 and demonstrates strong reasoning capabilities. It achieved impressive results across various benchmarks, including 93.9% accuracy on MATH-500, 69.7% pass rate on AIME 2024, and a rating of 1481 on CodeForces, showcasing its powerful abilities in mathematics and programming tasks...
Total Context: 131K
Max output: 131K
Input: $0.1 / M Tokens
Output: $0.1 / M Tokens
DeepSeek
chat
DeepSeek-R1-Distill-Qwen-32B
Release on: Jan 20, 2025
DeepSeek-R1-Distill-Qwen-32B is a distilled model based on Qwen2.5-32B. The model was fine-tuned using 800k curated samples generated by DeepSeek-R1 and demonstrates exceptional performance across mathematics, programming, and reasoning tasks. It achieved impressive results in various benchmarks including AIME 2024, MATH-500, and GPQA Diamond, with a notable 94.3% accuracy on MATH-500, showcasing its strong mathematical reasoning capabilities...
Total Context: 131K
Max output: 131K
Input: $0.18 / M Tokens
Output: $0.18 / M Tokens
DeepSeek
chat
DeepSeek-R1-Distill-Qwen-7B
Release on: Jan 20, 2025
DeepSeek-R1-Distill-Qwen-7B is a distilled model based on Qwen2.5-Math-7B. The model was fine-tuned using 800k curated samples generated by DeepSeek-R1 and demonstrates strong reasoning capabilities. It achieved impressive results across various benchmarks, including 92.8% accuracy on MATH-500, 55.5% pass rate on AIME 2024, and a rating of 1189 on CodeForces, showing remarkable mathematical and programming abilities for a 7B-scale model...
Total Context: 33K
Max output: 16K
Input: $0.05 / M Tokens
Output: $0.05 / M Tokens
DeepSeek
chat
DeepSeek-V3
Release on: Dec 26, 2024
The new version of DeepSeek-V3 (DeepSeek-V3-0324) utilizes the same base model as the previous DeepSeek-V3-1226, with improvements made only to the post-training methods. The new V3 model incorporates reinforcement learning techniques from the training process of the DeepSeek-R1 model, significantly enhancing its performance on reasoning tasks. It has achieved scores surpassing GPT-4.5 on evaluation sets related to mathematics and coding. Additionally, the model has seen notable improvements in tool invocation, role-playing, and casual conversation capabilities...
Total Context: 164K
Max output: 164K
Input: $0.25 / M Tokens
Output: $1.0 / M Tokens
DeepSeek
chat
DeepSeek-V3.1
Release on: Aug 25, 2025
DeepSeek-V3.1 is a hybrid large language model released by DeepSeek AI, featuring significant upgrades over its predecessor. A key innovation is the integration of both a 'Thinking Mode' for deliberative, chain-of-thought reasoning and a 'Non-thinking Mode' for direct responses, which can be switched via the chat template to suit various tasks. The model's capabilities in tool use and agent tasks have been substantially improved through post-training optimization, enabling better support for external search tools and complex multi-step instructions. DeepSeek-V3.1 is post-trained on top of the DeepSeek-V3.1-Base model, which underwent a two-phase long-context extension with a vastly expanded dataset, enhancing its ability to process long documents and codebases. As an open-source model, DeepSeek-V3.1 demonstrates performance comparable to leading closed-source models on various benchmarks, particularly in coding, math, and reasoning, while its Mixture-of-Experts (MoE) architecture maintains a massive parameter count while reducing inference costs...
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Output: $1.0 / M Tokens
DeepSeek
chat
DeepSeek-VL2
Release on: Dec 13, 2024
DeepSeek-VL2 is a Mixture-of-Experts (MoE) vision-language model built on DeepSeekMoE-27B, employing a sparsely activated MoE architecture to achieve strong performance with only 4.5B active parameters. The model excels in tasks including visual question answering, optical character recognition, document/table/chart understanding, and visual grounding. Compared with existing open-source dense and MoE-based models, it delivers competitive or state-of-the-art performance with the same or fewer active parameters...
Total Context: 4K
Max output: 4K
Input: $0.15 / M Tokens
Output: $0.15 / M Tokens

BAIDU
chat
ERNIE-4.5-300B-A47B
Release on: Jul 2, 2025
ERNIE-4.5-300B-A47B is a large language model developed by Baidu based on a Mixture-of-Experts (MoE) architecture. The model has a total of 300 billion parameters, but only activates 47 billion parameters per token during inference, thus balancing powerful performance with computational efficiency. As one of the core models in the ERNIE 4.5 series, it is trained on the PaddlePaddle deep learning framework and demonstrates outstanding capabilities in tasks such as text understanding, generation, reasoning, and coding. The model utilizes an innovative multimodal heterogeneous MoE pre-training method, which effectively enhances its overall abilities through joint training on text and visual modalities, showing prominent results in instruction following and world knowledge memorization. Baidu has open-sourced this model along with others in the series to promote the research and application of AI technology...
Total Context: 131K
Max output: 131K
Input: $0.28 / M Tokens
Output: $1.1 / M Tokens

Z.ai
chat
GLM-4-32B-0414
Release on: Apr 18, 2025
GLM-4-32B-0414 is a new-generation model in the GLM family with 32 billion parameters. Its performance is comparable to OpenAI's GPT series and DeepSeek's V3/R1 series, and it supports user-friendly local deployment. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including a large amount of reasoning-oriented synthetic data, laying the foundation for subsequent reinforcement learning extensions. In the post-training stage, in addition to human preference alignment for dialogue scenarios, the team enhanced the model's performance in instruction following, engineering code, and function calling using techniques such as rejection sampling and reinforcement learning, strengthening the atomic capabilities required for agent tasks. GLM-4-32B-0414 achieves strong results in engineering code, Artifact generation, function calling, search-based Q&A, and report generation. On several benchmarks, its performance approaches or even exceeds that of larger models like GPT-4o and DeepSeek-V3-0324 (671B)...
Total Context: 33K
Max output: 33K
Input: $0.27 / M Tokens
Output: $0.27 / M Tokens
Model FAQs: Usage, Deployment
Learn how to use, fine-tune, and deploy this model with ease.
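As a minimal, hedged starting point for local deployment, the sketch below loads the checkpoint with Hugging Face transformers; the memory note, device placement, and generation settings are assumptions rather than instructions from this page.

```python
# Minimal local-inference sketch using Hugging Face transformers.
# A 30B MoE checkpoint in bf16 needs roughly 60+ GB of accelerator memory,
# so device_map="auto" is used to spread weights across available devices.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Coder-30B-A3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a quicksort function in Python."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For production serving, an OpenAI-compatible inference server such as vLLM or SGLang is a common route, exposing the same chat-completions interface shown earlier in this page.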