DeepSeek
Text Generation
DeepSeek-V4-Pro
DeepSeek-V4-Pro is DeepSeek's flagship open-source MoE model with 1.6T total parameters and 49B activated, purpose-built for frontier-level reasoning, coding, and agentic tasks. Supporting a 1M-token context window and three reasoning effort modes up to Think Max, it achieves top-tier performance on coding benchmarks such as LiveCodeBench and Codeforces — rivaling leading closed-source models — and is released under the MIT License....
Total Context: 1049K
Max output: 393K
Input: $1.74 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $3.48 / M Tokens
DeepSeek
Text Generation
DeepSeek-V4-Flash
DeepSeek-V4-Flash is DeepSeek's latest open-source MoE model featuring 284B total parameters with only 13B activated during inference, delivering high-speed generation without sacrificing capability. With native support for a 1M-token context window and three switchable reasoning modes — Non-Think, Think High, and Think Max — it offers flexible intelligence scaling from everyday tasks to complex reasoning, all under the MIT License....
Total Context: 1049K
Max output: 393K
Input: $0.14 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $0.28 / M Tokens
DeepSeek
Text Generation
DeepSeek-V3.2
DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance. Its approach is built upon three key technical breakthroughs: DeepSeek Sparse Attention (DSA), an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios; a Scalable Reinforcement Learning Framework, which enables performance comparable to GPT-5 and reasoning proficiency on par with Gemini-3.0-Pro in its high-compute variant; and a Large-Scale Agentic Task Synthesis Pipeline to integrate reasoning into tool-use scenarios, improving compliance and generalization in complex interactive environments. The model has achieved gold-medal performance in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI)...
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $0.42 / M Tokens
DeepSeek
Text Generation
DeepSeek-V3.2-Exp
DeepSeek-V3.2-Exp is an experimental version of the DeepSeek model, built on V3.1-Terminus. It debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training and inference on long contexts....
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $0.41 / M Tokens
DeepSeek
Text Generation
DeepSeek-V3.1-Terminus
DeepSeek-V3.1-Terminus is an updated version that builds on V3.1's strengths while addressing key user feedback. It improves language consistency, reducing instances of mixed Chinese-English text and occasional abnormal characters, and delivers stronger Code Agent and Search Agent performance....
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $1.0 / M Tokens
DeepSeek
Text Generation
DeepSeek-V3.1
DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Through post-training optimization, the model's performance in tool usage and agent tasks has significantly improved. DeepSeek-V3.1-Think achieves comparable answer quality to DeepSeek-R1-0528, while responding more quickly....
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $1.0 / M Tokens
DeepSeek
Text Generation
DeepSeek-V3
DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including a major boost in reasoning performance, stronger front-end development skills, and smarter tool-use capabilities....
Total Context: 164K
Max output: 164K
Input: $0.25 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $1.0 / M Tokens
DeepSeek
Text Generation
DeepSeek-R1
DeepSeek-R1-0528 is an upgraded model that shows significant improvements in handling complex reasoning tasks. It also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding. It achieves performance comparable to O3 and Gemini 2.5 Pro....
Total Context: 164K
Max output: 164K
Input: $0.5 / M Tokens
Cached Input: $[missing] / M Tokens
Output: $2.18 / M Tokens
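
The per-million-token prices in this listing can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the `PRICES` table and the `estimate_cost` helper are hypothetical names, only three of the listed models are included, and cached-input pricing is ignored because its value does not appear in the listing. It assumes every input token is billed at the uncached input rate.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# USD per million tokens, copied from the listing (input and output rates only).
# Cached-input rates are omitted because the listing does not show them.
PRICES = {
    "DeepSeek-V4-Pro": {"input": Decimal("1.74"), "output": Decimal("3.48")},
    "DeepSeek-V4-Flash": {"input": Decimal("0.14"), "output": Decimal("0.28")},
    "DeepSeek-V3.2": {"input": Decimal("0.27"), "output": Decimal("0.42")},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return an estimated cost in USD for one request (hypothetical helper).

    Assumes all input tokens are billed at the uncached input rate.
    """
    price = PRICES[model]
    total = (
        Decimal(input_tokens) * price["input"]
        + Decimal(output_tokens) * price["output"]
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Compare the same workload (200K input, 50K output tokens) across models.
    for name in PRICES:
        print(name, estimate_cost(name, 200_000, 50_000))
```

Decimal arithmetic is used here so that small per-token amounts do not pick up binary floating-point rounding noise; actual billing may round or meter differently.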

