State-of-the-Art

AI Model Library

One API to run inference on 200+ cutting-edge AI models, and deploy in seconds

DeepSeek

Text Generation

DeepSeek-V4-Pro

DeepSeek-V4-Pro is DeepSeek's flagship open-source MoE model with 1.6T total parameters and 49B activated, purpose-built for frontier-level reasoning, coding, and agentic tasks. Supporting a 1M-token context window and three reasoning effort modes up to Think Max, it achieves top-tier performance on coding benchmarks such as LiveCodeBench and Codeforces — rivaling leading closed-source models — and is released under the MIT License....

Total Context: 1049K
Max output: 393K
Input: $1.74 / M Tokens
Cached Input: price not shown
Output: $3.48 / M Tokens
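The per-million-token prices on this card translate into a per-request cost with simple arithmetic. The sketch below is only an illustration: it assumes billing is linear in token count with no minimum charge, and it ignores cached-input pricing because the card does not show a value for it. The function name `estimate_cost` and the token counts in the usage line are hypothetical.

```python
from decimal import Decimal

# Prices in USD per million tokens, copied from the DeepSeek-V4-Pro card.
INPUT_PRICE_PER_M = Decimal("1.74")
OUTPUT_PRICE_PER_M = Decimal("3.48")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (assumes linear per-token billing)."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


# Hypothetical request: 200K prompt tokens and 50K generated tokens.
cost = estimate_cost(200_000, 50_000)
print(f"Estimated cost: ${cost:.4f}")
```

`Decimal` is used here instead of `float` so that small currency amounts do not pick up binary rounding noise when many requests are summed.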

DeepSeek

Text Generation

DeepSeek-V4-Flash

DeepSeek-V4-Flash is DeepSeek's latest open-source MoE model featuring 284B total parameters with only 13B activated during inference, delivering high-speed generation without sacrificing capability. With native support for a 1M-token context window and three switchable reasoning modes — Non-Think, Think High, and Think Max — it offers flexible intelligence scaling from everyday tasks to complex reasoning, all under the MIT License....

Total Context: 1049K
Max output: 393K
Input: $0.14 / M Tokens
Cached Input: price not shown
Output: $0.28 / M Tokens

DeepSeek

Text Generation

DeepSeek-V3.2

DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance. Its approach is built upon three key technical breakthroughs: DeepSeek Sparse Attention (DSA), an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios; a Scalable Reinforcement Learning Framework, which enables performance comparable to GPT-5 and reasoning proficiency on par with Gemini-3.0-Pro in its high-compute variant; and a Large-Scale Agentic Task Synthesis Pipeline to integrate reasoning into tool-use scenarios, improving compliance and generalization in complex interactive environments. The model has achieved gold-medal performance in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI)...

Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: price not shown
Output: $0.42 / M Tokens

DeepSeek

Text Generation

DeepSeek-V3.2-Exp

DeepSeek-V3.2-Exp is an experimental version of the DeepSeek model, built on V3.1-Terminus. It debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training and inference on long contexts....

Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: price not shown
Output: $0.41 / M Tokens

DeepSeek

Text Generation

DeepSeek-V3.1-Terminus

DeepSeek-V3.1-Terminus is an updated version that builds on V3.1's strengths while addressing key user feedback. It improves language consistency, reducing instances of mixed Chinese-English text and occasional abnormal characters, and it delivers stronger Code Agent and Search Agent performance....

Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: price not shown
Output: $1.00 / M Tokens

DeepSeek

Text Generation

DeepSeek-V3.1

DeepSeek-V3.1 is a hybrid model that supports both thinking and non-thinking modes. Through post-training optimization, the model's performance in tool usage and agent tasks has improved significantly. DeepSeek-V3.1-Think achieves answer quality comparable to DeepSeek-R1-0528 while responding more quickly....

Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Cached Input: price not shown
Output: $1.00 / M Tokens

DeepSeek

Text Generation

DeepSeek-V3

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including a major boost in reasoning performance, stronger front-end development skills, and smarter tool-use capabilities....

Total Context: 164K
Max output: 164K
Input: $0.25 / M Tokens
Cached Input: price not shown
Output: $1.00 / M Tokens

DeepSeek

Text Generation

DeepSeek-R1

DeepSeek-R1-0528 is an upgraded model that shows significant improvements in handling complex reasoning tasks. It also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding. It achieves performance comparable to O3 and Gemini 2.5 Pro....

Total Context: 164K
Max output: 164K
Input: $0.50 / M Tokens
Cached Input: price not shown
Output: $2.18 / M Tokens
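To compare the listed models on price alone, the input and output rates from the cards can be applied to a single workload and sorted. This is a rough sketch, not a recommendation: the workload of one million input and one million output tokens is an arbitrary assumption, cached-input pricing is left out because no value is shown, and the helper names `workload_cost` and `rank_by_cost` are hypothetical. Price says nothing about quality, context length, or speed.

```python
# (model name, input USD per M tokens, output USD per M tokens),
# copied from the pricing cards on this page.
CATALOG = [
    ("DeepSeek-V4-Pro", 1.74, 3.48),
    ("DeepSeek-V4-Flash", 0.14, 0.28),
    ("DeepSeek-V3.2", 0.27, 0.42),
    ("DeepSeek-V3.2-Exp", 0.27, 0.41),
    ("DeepSeek-V3.1-Terminus", 0.27, 1.0),
    ("DeepSeek-V3.1", 0.27, 1.0),
    ("DeepSeek-V3", 0.25, 1.0),
    ("DeepSeek-R1", 0.5, 2.18),
]


def workload_cost(input_price, output_price, input_tokens, output_tokens):
    """USD cost of a workload, assuming linear per-token billing."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_by_cost(input_tokens, output_tokens):
    """Return (name, cost) pairs sorted from cheapest to most expensive."""
    costs = [
        (name, workload_cost(inp, out, input_tokens, output_tokens))
        for name, inp, out in CATALOG
    ]
    return sorted(costs, key=lambda item: item[1])


# Assumed workload: 1M input tokens and 1M output tokens.
ranking = rank_by_cost(1_000_000, 1_000_000)
for name, cost in ranking:
    print(f"{name:<24} ${cost:.2f}")
```

Changing the token counts shifts the ranking toward models with cheaper input or cheaper output, which matters for prompt-heavy versus generation-heavy workloads.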

Ready to accelerate your AI development?
