DeepSeek-V3.2 Now on SiliconFlow: Reasoning-first model built for agents

Dec 8, 2025

TL;DR: DeepSeek-V3.2 (the official release of V3.2-Exp) is now live on SiliconFlow. A reasoning-first model built for agents, it combines high efficiency with GPT-5-level reasoning performance and a 164K context window. It also supports tool use in thinking mode, validated across 85K+ complex instructions and 1,800+ environments. Start building today with SiliconFlow's API to supercharge your agentic workflows.

We are thrilled to unlock access to DeepSeek's latest model on SiliconFlow: DeepSeek-V3.2, a new series that harmonizes computational efficiency with superior reasoning and agentic performance. As the first DeepSeek model to integrate thinking directly into tool use, DeepSeek-V3.2 delivers GPT-5-level reasoning with significantly shorter outputs. Meanwhile, DeepSeek-V3.2-Speciale pushes the open-source boundaries of theorem proving and coding to rival Gemini 3 Pro. Together, they set a new benchmark for developers building next-generation AI agents.

Now, through SiliconFlow's DeepSeek-V3.2 API, you can expect:

  • Cost-effective Pricing:

    • DeepSeek-V3.2: $0.27/M tokens (input) and $0.42/M tokens (output)

    • DeepSeek-V3.2-Speciale: coming soon; stay tuned for first-hand updates

  • 164K Context Window: Perfect for long documents, complex multi-turn conversations, and extended agentic tasks.

  • Seamless Integration: Instantly deploy via SiliconFlow's OpenAI-compatible API, or plug into your existing stack through Claude Code, Gen-CLI, and Cline (see the sketch below).
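
If you already use the OpenAI SDK, pointing it at SiliconFlow is a one-line change. Below is a minimal sketch, assuming the openai Python package; the base URL and model ID match the REST example later in this post:

```python
from openai import OpenAI

# Point the standard OpenAI client at SiliconFlow's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="<token>",  # your SiliconFlow API key
    base_url="https://api.siliconflow.com/v1",
)

resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2",
    messages=[
        {"role": "user", "content": "Explain what an agentic workflow is in two sentences."}
    ],
)
print(resp.choices[0].message.content)
```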


Whether you're building agents, coding assistants, or complex reasoning pipelines, SiliconFlow's DeepSeek-V3.2 API delivers the performance you need at a fraction of the expected cost and latency.


Why it matters

For developers building agents, multi-step reasoning pipelines, or any AI system that needs to think and act, the DeepSeek-V3.2 series finally delivers the combination the industry has been waiting for: frontier-grade reasoning, integrated tool use during thinking, and real-world efficiency.

  1. World-Leading Reasoning Capabilities

  • DeepSeek-V3.2: The Efficient "Daily Driver" for Agents

Engineered to strike the perfect balance between reasoning capabilities and output length, DeepSeek-V3.2 is your go-to choice for production workflows, such as advanced Q&A and general agent tasks.

  • Performance: Delivers reasoning capabilities on par with GPT-5.

  • Efficiency: Compared to Kimi-K2-Thinking, V3.2 has significantly shorter output lengths, translating to lower computational overhead and reduced overall generation time.


  • DeepSeek-V3.2-Speciale: Maxed-out reasoning capabilities (Research Preview)

As the enhanced long-thinking variant of V3.2, V3.2-Speciale aims to push the boundaries of open-source reasoning, integrating the theorem-proving capabilities of DeepSeek-Math-V2.

  • Gold-Medal Performance: V3.2-Speciale attains gold-level results at the IMO, CMO, ICPC World Finals, and IOI 2025.

  • Benchmarks: It excels in complex instruction following, rigorous mathematical reasoning and logical verification, effectively rivaling Gemini 3 Pro on mainstream reasoning leaderboards.


| Benchmark | DeepSeek-V3.2-Speciale | DeepSeek-V3.2-Thinking | GPT-5 High | Gemini-3.0 Pro | Kimi-K2 Thinking |
|---|---|---|---|---|---|
| AIME 2025 | 🥇 96.0 (23k) | 93.1 (16k) | 94.6 (13k) | 95.0 (15k) | 94.5 (24k) |
| HMMT Feb 2025 | 🥇 99.2 (27k) | 92.5 (19k) | 88.3 (16k) | 97.5 (16k) | 89.4 (31k) |
| HMMT Nov 2025 | 🥇 94.4 (25k) | 90.2 (18k) | 89.2 (20k) | 93.3 (15k) | 89.2 (29k) |
| IMOAnswerBench | 🥇 84.5 (45k) | 78.3 (27k) | 76.0 (31k) | 83.3 (18k) | 78.6 (37k) |
| LiveCodeBench | 88.7 (27k) | 83.3 (16k) | 84.5 (13k) | 90.7 (13k) | 82.6 (29k) |
| CodeForces | 2701 (77k) | 2386 (42k) | 2537 (29k) | 2708 (22k) | - |
| GPQA Diamond | 85.7 (16k) | 82.4 (7k) | 85.7 (8k) | 91.9 (8k) | 84.5 (12k) |
| HLE | 30.6 (35k) | 25.1 (21k) | 26.3 (15k) | 37.7 (15k) | 23.9 (24k) |

The number in parentheses indicates the approximate total token consumption.


  2. Thinking in Tool-Use

DeepSeek-V3.2 breaks the barrier between "reasoning" and "acting." Unlike previous versions, where tool usage was restricted during the thinking process, DeepSeek-V3.2 is the first DeepSeek model to integrate tool invocation directly into thinking, supporting tool calls in both Thinking and Non-Thinking modes.

To deliver this level of agentic reliability, DeepSeek introduces a massive-scale training synthesis method:

  • Robust Generalization: The model was trained on "hard-to-solve, easy-to-verify" reinforcement learning tasks.

  • Extensive Coverage: Training spanned 1,800+ distinct environments and 85,000+ complex instructions, significantly enhancing the model's generalization and instruction-following capability in agent contexts.


| Benchmark | DeepSeek-V3.2-Thinking | GPT-5-High | Gemini-3.0-Pro | Kimi-K2-Thinking | MiniMax M2 |
|---|---|---|---|---|---|
| τ² Bench (Pass@1) | 80.3 | 80.2 | 85.4 | 74.3 | 76.9 |
| MCP-Universe | 45.9 | 47.9 | 50.7 | 35.6 | 29.4 |
| MCP-Mark | 38.0 | 50.9 | 43.1 | 20.4 | 24.4 |
| Tool Decathlon (Pass@1) | 35.2 | 29.0 | 36.4 | 17.6 | 16.0 |


What this means for your workflows:

DeepSeek-V3.2 emerges as a highly cost-efficient option for agent scenarios, significantly narrowing the performance gap between open and frontier proprietary models at substantially lower cost, all available via SiliconFlow's API.

Here's an example built using SiliconFlow's DeepSeek-V3.2 API, demonstrating how the model can assist in writing, optimizing, and reasoning about code while invoking tools from thinking mode.
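
The sketch below is illustrative rather than official: the run_python tool is hypothetical, and the response handling assumes the standard OpenAI chat-completions shape exposed by SiliconFlow's endpoint.

```python
import json
import requests

URL = "https://api.siliconflow.com/v1/chat/completions"
HEADERS = {
    "Authorization": "Bearer <token>",  # your SiliconFlow API key
    "Content-Type": "application/json",
}

# A hypothetical tool the model may call while reasoning.
tools = [{
    "type": "function",
    "function": {
        "name": "run_python",  # illustrative name, not a built-in
        "description": "Execute a Python snippet and return its stdout.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
}]

payload = {
    "model": "deepseek-ai/DeepSeek-V3.2",
    "messages": [{
        "role": "user",
        "content": "Benchmark this loop and suggest a faster rewrite:\n"
                   "total = 0\n"
                   "for i in range(10**6): total += i * i",
    }],
    "tools": tools,
    "enable_thinking": True,  # tool calls are supported in thinking mode
    "thinking_budget": 4096,
    "max_tokens": 4096,
}

resp = requests.post(URL, json=payload, headers=HEADERS).json()
msg = resp["choices"][0]["message"]

# The model either answers directly or requests a tool call mid-reasoning.
for call in msg.get("tool_calls") or []:
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))
print(msg.get("content"))
```

In a full agent loop, you would execute each requested call, append the result as a tool message, and re-invoke the endpoint until the model returns its final answer.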


What makes it powerful

The DeepSeek-V3.2 series' performance is enabled by three core technical breakthroughs:

  • DeepSeek Sparse Attention (DSA):

To tackle the challenge of long-context processing, the model introduces DeepSeek Sparse Attention (DSA), an efficient attention mechanism optimized for long-context scenarios that substantially reduces computational complexity without compromising performance (a conceptual sketch follows).
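
This post does not spell out DSA's selection mechanism, so the NumPy toy below only illustrates the general principle behind sparse attention: each query aggregates values from a small, score-selected subset of keys instead of the full context. Note that the toy still scores every key; a production sparse-attention kernel uses a cheap indexer to avoid even that.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=64):
    """Toy sparse attention: attend only to the k highest-scoring keys.

    Conceptual illustration only; this is not DSA's actual algorithm,
    and unlike a real sparse kernel it still scores every key.
    """
    scores = K @ q / np.sqrt(q.shape[-1])        # similarity of q to all keys
    top = np.argpartition(scores, -k)[-k:]       # indices of the k best keys
    w = np.exp(scores[top] - scores[top].max())  # softmax over the subset only
    w /= w.sum()
    return w @ V[top]                            # mix just k value vectors

# 4,096 keys in context, but each query mixes only 64 of them.
rng = np.random.default_rng(0)
K = rng.standard_normal((4096, 128))
V = rng.standard_normal((4096, 128))
q = rng.standard_normal(128)
print(topk_sparse_attention(q, K, V).shape)  # (128,)
```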

  • Scalable Reinforcement Learning:

DeepSeek-V3.2 leverages a robust Reinforcement Learning (RL) protocol combined with scaled post-training compute. This advanced training framework is the key driver behind the model's exceptional reasoning capabilities.

  • Large-Scale Agentic Task Synthesis Pipeline:

DeepSeek has revolutionized agent capability through a novel Large-Scale Agentic Task Synthesis Pipeline. By systematically generating training data at scale, it integrates reasoning directly into tool-use scenarios, yielding superior compliance and generalization so that your agents can reliably navigate complex, multi-step interactive environments.


Developer-Ready Integration

Beyond DeepSeek-V3.2's industry-leading agentic performance, SiliconFlow delivers instant compatibility with your existing development ecosystem:

  • OpenAI-Compatible Tools: Seamless integration with Cline, Qwen Code, Gen-CLI, and other standard development environments; just plug in your SiliconFlow API key.

  • Anthropic-Compatible API: Works with Claude Code and any Anthropic-compatible tools for code reviews, debugging, and architectural refactoring.

  • Platform Integrations: Ready-to-use in Dify, ChatHub, Chatbox, Sider, MindSearch, DB-GPT, and also available through OpenRouter.

With powerful models, seamless integrations, and competitive pricing, SiliconFlow transforms how you build — letting you ship faster and scale smarter.


Get Started Immediately

  1. Explore: Try DeepSeek-V3.2 in the SiliconFlow playground.

  2. Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the SiliconFlow API documentation.

```python
import requests

url = "https://api.siliconflow.com/v1/chat/completions"

payload = {
    "model": "deepseek-ai/DeepSeek-V3.2",
    "messages": [
        {
            "role": "user",
            "content": "an island near the sea, with seagulls, moon shining "
                       "over the sea, lighthouse, boats in the background, "
                       "fish flying over the sea"
        }
    ],
    "stream": True,            # stream tokens back as they are generated
    "max_tokens": 4096,        # cap on the generated output
    "enable_thinking": False,  # set True to enable thinking mode
    "thinking_budget": 4096,   # token budget for the thinking phase
    "min_p": 0,
    "temperature": 0.7,
    "top_p": 0.7,
    "top_k": 50,
    "frequency_penalty": 0.5,
    "n": 1
    # add a "tools" array here to enable function calling,
    # as in the tool-use sketch above
}
headers = {
    "Authorization": "Bearer <token>",  # replace <token> with your API key
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print(response.text)
```


Ready to accelerate your AI development?