DeepSeek-V3.1 on SiliconFlow: Hybrid Thinking, Smarter Tools and 164K Context Window

Sep 2, 2025

DeepSeek-V3.1 on SiliconFlow

TL;DR: DeepSeek-V3.1 is live on SiliconFlow! With advanced reasoning, 164K context window, and blazing-fast efficiency, you can now integrate SiliconFlow's DeepSeek-V3.1 API directly into Claude Code — cutting costs while enhancing your workflow!

SiliconFlow is thrilled to bring DeepSeek-V3.1 to our model catalog: the latest upgrade from DeepSeek that takes AI one step closer to the Agent era. With its Hybrid Thinking Mode, users can switch between standard and deep reasoning as needed, while smarter tool calling and faster reasoning deliver a smoother experience. On top of that, SiliconFlow supports a context window of up to 164K tokens, enabling richer conversations, longer document handling, and more complex tasks with ease.

With SiliconFlow's DeepSeek-V3.1 API, you can expect:

  • Budget-Friendly Pricing: DeepSeek-V3.1 costs $0.27/M tokens (input) and $1.10/M tokens (output).

  • Extended Context Window: 164K context window for complex tasks.

Whether you're a startup or an enterprise, SiliconFlow provides production-ready APIs that integrate seamlessly into real-world applications — at a fraction of the cost.

DeepSeek-V3.1's Breakthrough Performance

Compared to the previous version, this upgrade brings improvements in multiple aspects:

  • Hybrid thinking mode: One model supports both thinking mode and non-thinking mode by changing the chat template.

  • Smarter tool calling: Through post-training optimization, the model's performance in tool usage and agent tasks has significantly improved.

  • Higher thinking efficiency: DeepSeek-V3.1-Think achieves comparable answer quality to DeepSeek-R1-0528, while responding more quickly.
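
Hybrid thinking is exposed through SiliconFlow's OpenAI-compatible API as an `enable_thinking` flag, the same request field used in the API example later in this post. Below is a minimal sketch of toggling the two modes per request; the helper names (`build_payload`, `ask`) are ours for illustration, while `enable_thinking` and `thinking_budget` follow the request fields shown in the API example.

```python
import json
import os
import urllib.request

API_URL = "https://api.siliconflow.com/v1/chat/completions"

def build_payload(prompt: str, thinking: bool, budget: int = 4096) -> dict:
    """Build a chat-completions payload, toggling DeepSeek-V3.1's
    hybrid thinking mode via the `enable_thinking` flag."""
    payload = {
        "model": "deepseek-ai/DeepSeek-V3.1",
        "messages": [{"role": "user", "content": prompt}],
        "enable_thinking": thinking,
    }
    if thinking:
        payload["thinking_budget"] = budget  # cap on reasoning tokens
    return payload

def ask(prompt: str, thinking: bool = False) -> dict:
    """Send one request with the chosen mode and return the parsed JSON."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt, thinking)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['SILICONFLOW_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Only hits the network when a key is configured in the environment:
if os.environ.get("SILICONFLOW_API_KEY"):
    print(ask("Prove that the square root of 2 is irrational.", thinking=True))
```

The same model handles both calls; only the flag (and, in thinking mode, the budget) changes between a quick standard answer and a deep-reasoning one.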

Behind these gains is a major leap in training. DeepSeek-V3.1 expands the long-context extension pipeline with 10× more data for the 32K phase (630B tokens) and 3.3× more for the 128K phase (209B tokens). By incorporating additional long-document data and training with the UE8M0 FP8 format, the model turns larger-scale training into faster inference, higher efficiency, and seamless compatibility with modern infrastructure.

What This Means to You:

  • Longer Context Window: Review entire legal contracts in a single pass, analyze large codebases without chunking, or process research papers end-to-end. Through SiliconFlow, you can even access an extended 164K context window (~130,000 words, roughly 4-5 copies of The Old Man and the Sea in a single session) for handling exceptionally large documents and complex workflows.

  • Better Performance Across Domains: Whether it's solving complex math problems, drafting technical documentation, or handling cross-disciplinary reasoning tasks, DeepSeek-V3.1 delivers more accurate and reliable outputs powered by expanded training data.

  • Faster and More Efficient Workflows: Thanks to UE8M0 FP8 precision, you can generate responses faster and at lower compute cost — meaning quicker iteration for startups and more efficient scaling for enterprises.
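
To make the window concrete, here is a sketch of sending an entire document in one request instead of chunking it. The ~4-characters-per-token heuristic and the helper names are rough assumptions for illustration; use a real tokenizer for precise budgeting.

```python
MAX_CONTEXT_TOKENS = 164_000  # SiliconFlow-served DeepSeek-V3.1 window

def rough_token_count(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return len(text) // 4

def build_document_payload(document: str, question: str) -> dict:
    """Pack a whole document plus a question into a single request,
    guarding against blowing past the context window."""
    if rough_token_count(document) > MAX_CONTEXT_TOKENS:
        raise ValueError("document likely exceeds the 164K-token window")
    return {
        "model": "deepseek-ai/DeepSeek-V3.1",
        "messages": [
            {"role": "system",
             "content": "Answer using only the attached document."},
            {"role": "user",
             "content": f"{document}\n\nQuestion: {question}"},
        ],
    }
```

POST the returned dict to the chat-completions endpoint exactly as in the API example at the end of this post.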

Benchmark Performance

The gains of DeepSeek-V3.1 aren't just theoretical — they translate into measurable improvements across real-world benchmarks:

  • Coding & Execution: In code fixing evaluations on SWE and complex tasks in command-line terminal environments (Terminal-Bench), DeepSeek-V3.1 shows notable gains compared to previous DeepSeek series models.

| Category | Benchmark (Metric) | DeepSeek V3.1-NonThinking | DeepSeek V3 0324 | DeepSeek V3.1-Thinking | DeepSeek R1 0528 |
| --- | --- | --- | --- | --- | --- |
| Code | SWE Verified | 🥇66 | 45.4 | - | 44.6 |
| Code | SWE-bench Multilingual | 🥇54.5 | 29.3 | - | 30.5 |
| Code | Terminal-bench | 🥇31.3 | 13.3 | - | 5.7 |

  • Search & Reasoning: DeepSeek-V3.1 achieves significant improvements across multiple search evaluation metrics. In complex search tests requiring multi-step reasoning (BrowseComp) and multidisciplinary expert-level problem tests (HLE), DeepSeek-V3.1's performance substantially surpasses R1-0528.

| Category | Benchmark (Metric) | DeepSeek V3.1-NonThinking | DeepSeek V3 0324 | DeepSeek V3.1-Thinking | DeepSeek R1 0528 |
| --- | --- | --- | --- | --- | --- |
| Search Agent | BrowseComp | - | - | 🥇30 | 8.9 |
| Search Agent | BrowseComp_zh | - | - | 🥇49.2 | 35.7 |
| Search Agent | Humanity's Last Exam | - | - | 🥇29.8 | 24.8 |
| Search Agent | SimpleQA | - | - | 🥇93.4 | 92.3 |

Use SiliconFlow's DeepSeek-V3.1 API in Claude Code

Claude Code now supports DeepSeek models, so you can integrate SiliconFlow's DeepSeek-V3.1 API in just a few steps.

Step 1: Get Your SiliconFlow API Key

  1. Log in to your SiliconFlow dashboard.

  2. Navigate to API Keys section.

  3. Generate a new API key for DeepSeek V3.1 access.

  4. Copy and secure your API key.

Step 2: Configure Environment Variables

Open your terminal and set the following environment variables:

export ANTHROPIC_BASE_URL="https://api.siliconflow.com/v1"
export ANTHROPIC_MODEL="deepseek-ai/DeepSeek-V3.1"  # You can modify this to use other models as needed
export ANTHROPIC_API_KEY="YOUR_SILICONFLOW_API_KEY" # Please replace with your actual API Key
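
Before launching Claude Code, it can be worth confirming that all three variables are actually set in the current shell. A small optional check (the helper name is ours, not part of Claude Code):

```python
import os

REQUIRED = ("ANTHROPIC_BASE_URL", "ANTHROPIC_MODEL", "ANTHROPIC_API_KEY")

def missing_vars(env=os.environ) -> list:
    """Return the names of any required variables that are missing or empty."""
    return [name for name in REQUIRED if not env.get(name)]

missing = missing_vars()
if missing:
    print("Set these before launching Claude Code:", ", ".join(missing))
else:
    print("Environment looks good; run `claude` in your project directory.")
```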

Step 3: Start Using Claude Code with DeepSeek V3.1

Navigate to your project directory and launch Claude Code:

cd your-project-directory
claude

Claude Code will now use DeepSeek V3.1 via SiliconFlow's API service for all your coding assistance needs!

You can also access SiliconFlow's DeepSeek-V3.1 model through gen-cli and Cline.

Gen-CLI

Gen-CLI is based on the open-source Gemini-CLI and is now available on GitHub. Install using the following steps:

  1. Ensure your system has Node.js 18+ installed.

  2. Set the API key environment variable:

export SILICONFLOW_API_KEY="YOUR_API_KEY"

  3. Run Gen-CLI:

Via npx:

npx https://github.com/gen-cli/gen-cli

Or install via npm:

npm install -g @gen-cli/gen-cli
gen

Cline

  1. In VSCode, open the command palette with Ctrl/Command+Shift+P and open Cline in a new tab for configuration.

  2. Configure in the new window:

  • API Provider: Select OpenAI Compatible

  • Base URL: https://api.siliconflow.com/v1

  • API Key: Obtain from https://cloud.siliconflow.com/account/ak

  • Model ID: Select from https://cloud.siliconflow.com/models

  3. Start using Cline.

Get Started Immediately

  1. Explore: Try DeepSeek-V3.1 in the SiliconFlow playground.

  2. Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the SiliconFlow API documentation.

import requests

url = "https://api.siliconflow.com/v1/chat/completions"

payload = {
    "model": "deepseek-ai/DeepSeek-V3.1",
    "thinking_budget": 4096,  # cap on reasoning tokens in thinking mode
    "top_p": 0.7,
    "messages": [
        {
            "content": "tell me a story",
            "role": "user"
        }
    ],
    "enable_thinking": True  # toggle hybrid thinking mode on
}
headers = {
    "Authorization": "Bearer <token>",  # replace <token> with your SiliconFlow API key
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
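
Once the request returns, the assistant's reply lives under `choices[0].message`. Here is a small helper for pulling it out; note that the `reasoning_content` field for thinking-mode output is an assumption based on DeepSeek-style responses, so verify the exact field name against the SiliconFlow API documentation.

```python
def extract_answer(response_json: dict) -> dict:
    """Pull the assistant message out of a chat-completions response.

    When thinking mode is on, the chain of thought may arrive in a
    separate `reasoning_content` field (field name assumed here;
    check the API docs for the response schema on your account).
    """
    message = response_json["choices"][0]["message"]
    return {
        "answer": message.get("content", ""),
        "reasoning": message.get("reasoning_content"),  # None when absent
    }
```

For example, `extract_answer(response.json())["answer"]` gives you just the final text, with any thinking-mode reasoning available separately.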

Start building with DeepSeek-V3.1 on SiliconFlow today — faster, smarter, and more cost-effective AI for your applications.

Ready to accelerate your AI development?

© 2025 SiliconFlow Technology PTE. LTD.