State-of-the-Art AI Model Library

One API to run inference on 200+ cutting-edge AI models, and deploy in seconds.
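The "one API" claim above can be made concrete with a short client sketch. Nothing below is taken from this page: the endpoint URL, environment-variable name, and model ID are assumptions, and the request shape simply follows the widely used OpenAI-compatible chat-completions format.

```python
import json
import os
import urllib.request

# Hypothetical sketch: the endpoint URL, env-var name, and model ID are
# illustrative assumptions, not taken from this page; any OpenAI-compatible
# chat-completions endpoint accepts the same request shape.
API_URL = "https://api.siliconflow.cn/v1/chat/completions"

def build_chat_payload(model: str, prompt: str) -> dict:
    """Build a minimal OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """Send one chat request and return the assistant's reply text."""
    data = json.dumps(build_chat_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "Authorization": f"Bearer {os.environ['SILICONFLOW_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# chat("deepseek-ai/DeepSeek-V3.2", "Say hello")  # requires a valid API key
```

Switching models is then just a matter of changing the model string; the request and response shapes stay the same across the catalog.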


DeepSeek · Text Generation

DeepSeek-V3.2
Released: Dec 4, 2025

DeepSeek-V3.2 harmonizes high computational efficiency with superior reasoning and agent performance. Its approach rests on three key technical breakthroughs: DeepSeek Sparse Attention (DSA), an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, optimized specifically for long-context scenarios; a scalable reinforcement learning framework that enables performance comparable to GPT-5 and, in its high-compute variant, reasoning proficiency on par with Gemini-3.0-Pro; and a large-scale agentic task synthesis pipeline that integrates reasoning into tool-use scenarios, improving compliance and generalization in complex interactive environments. The model achieved gold-medal performance at the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI)...

Total context: 164K · Max output: 164K
Input: $0.27 / M tokens · Output: $0.42 / M tokens
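Per-million-token pricing like the above translates directly into a per-request cost estimate. A minimal sketch (the helper name and the token counts are illustrative; the rates are the DeepSeek-V3.2 prices listed above):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Estimate request cost in USD from per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# DeepSeek-V3.2 rates: $0.27 / M input tokens, $0.42 / M output tokens
cost = request_cost(input_tokens=12_000, output_tokens=3_000,
                    input_price=0.27, output_price=0.42)
print(f"${cost:.6f}")  # 12,000*0.27/1e6 + 3,000*0.42/1e6 = $0.004500
```

The same two rates apply to every text model in this list; only the numbers change per card.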

DeepSeek · Text Generation

DeepSeek-V3.2-Exp
Released: Oct 10, 2025

DeepSeek-V3.2-Exp is an experimental version of the DeepSeek model, built on V3.1-Terminus. It debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training and inference on long contexts...

Total context: 164K · Max output: 164K
Input: $0.27 / M tokens · Output: $0.41 / M tokens

DeepSeek · Text Generation

DeepSeek-V3.1-Terminus
Released: Sep 29, 2025

DeepSeek-V3.1-Terminus is an updated version that builds on V3.1's strengths while addressing key user feedback. It improves language consistency, reducing mixed Chinese-English text and occasional abnormal characters, and delivers stronger Code Agent and Search Agent performance...

Total context: 164K · Max output: 164K
Input: $0.27 / M tokens · Output: $1.00 / M tokens

DeepSeek · Text Generation

DeepSeek-V3
Released: Dec 26, 2024

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects, including a major boost in reasoning performance, stronger front-end development skills, and smarter tool-use capabilities...

Total context: 164K · Max output: 164K
Input: $0.25 / M tokens · Output: $1.00 / M tokens

Black Forest Labs · Text-to-Image

FLUX.2 [flex]
Released: Dec 11, 2025

Price: $0.06 / image

Moonshot AI · Text Generation

Kimi-K2-Instruct-0905
Released: Sep 8, 2025

Kimi-K2-Instruct-0905 is a state-of-the-art mixture-of-experts (MoE) language model and the latest, most capable version of Kimi K2. Key features include enhanced coding capabilities, especially for front-end work and tool calling, a context length extended to 256K tokens, and improved integration with various agent scaffolds...

Total context: 262K · Max output: 262K
Input: $0.40 / M tokens · Output: $2.00 / M tokens

OpenAI · Text Generation

gpt-oss-120b
Released: Aug 13, 2025

The gpt-oss series is OpenAI's family of open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-120b targets production, general-purpose, high-reasoning workloads that fit on a single 80GB GPU (such as an NVIDIA H100 or AMD MI300X)...

Total context: 131K · Max output: 8K
Input: $0.05 / M tokens · Output: $0.45 / M tokens
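The single-80GB-GPU claim for gpt-oss-120b can be sanity-checked with rough weight-memory arithmetic. A sketch under stated assumptions: the parameter count (~117B total) and bits-per-weight (~4.25, reflecting the roughly 4-bit MXFP4 quantization reported for most of its weights) are approximate figures, and the estimate covers weights only, not KV cache or activations.

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate GPU memory (GB) for model weights alone.

    Ignores KV cache, activations, and framework overhead.
    """
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# ~117B parameters at ~4.25 bits/param (MXFP4 with block scales) — rough figures
print(round(weight_memory_gb(117, 4.25), 1))  # ≈ 62.2 GB, under an 80GB H100/MI300X
```

At 16-bit precision the same model would need roughly 234 GB, which is why the quantized format is what makes the single-GPU deployment plausible.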

OpenAI · Text Generation

gpt-oss-20b
Released: Aug 13, 2025

The gpt-oss series is OpenAI's family of open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-20b targets lower-latency, local, or specialized use cases...

Total context: 131K · Max output: 8K
Input: $0.04 / M tokens · Output: $0.18 / M tokens

Z.ai · Text Generation

GLM-4.6
Released: Oct 4, 2025

Compared with GLM-4.5, GLM-4.6 brings several key improvements: a context window expanded to 200K tokens, superior coding performance, advanced reasoning, more capable agents, and refined writing...

Total context: 205K · Max output: 205K
Input: $0.39 / M tokens · Output: $1.90 / M tokens

Ready to accelerate your AI development?



© 2025 SiliconFlow
