Seed-OSS-36B-Instruct is now available on SiliconFlow: smarter AI on demand

September 5, 2025

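SiliconFlow is pleased to add Seed-OSS-36B-Instruct to our model catalog. ByteDance's open-source model puts control over AI reasoning in your hands. With its flexible thinking budget, users can tune the reasoning depth for each task, while enhanced reasoning capabilities and agentic intelligence deliver excellent problem-solving performance.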

With SiliconFlow's Seed-OSS-36B-Instruct API, you can expect:

Competitive pricing: Seed-OSS-36B-Instruct costs $0.21 per million tokens (input) and $0.57 per million tokens (output).
262K context window: lets users handle complex tasks smoothly.
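
As a rough illustration of what these rates mean per request, the sketch below estimates cost from token counts. The function name `estimate_cost_usd` and the sample token counts are hypothetical; only the two per-million-token rates come from the list above.

```python
# Rates quoted above, in USD per one million tokens.
INPUT_USD_PER_MILLION = 0.21
OUTPUT_USD_PER_MILLION = 0.57


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (illustrative helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000


# Hypothetical request: a 200,000-token document with a 2,000-token answer.
print(f"${estimate_cost_usd(200_000, 2_000):.4f}")
```

Keeping the rates as named constants makes it easy to update the estimate if the listed prices change.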

Why Seed-OSS matters

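Most open-source models behave like a black box: you cannot control how much the AI thinks, long documents quickly hit the context limit, and costs rise unpredictably with task complexity. Seed-OSS-36B-Instruct changes this: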

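Flexible control of the thinking budget: Users can adjust the reasoning length to match task complexity, balancing accuracy, efficiency, and cost. Set the budget in multiples of 512 tokens (use 0 for an immediate direct response) to give developers control over performance across deployment scenarios. This is particularly well suited to customer support and autonomous agents.
Native long context: Unlike models that are retrofitted for long context afterwards, Seed-OSS is natively trained with a context length of up to 512K. In other words, it delivers more stable and consistent performance even with large inputs.
Advanced reasoning and agentic intelligence: Specifically optimized for complex reasoning tasks while maintaining balanced general capabilities, it excels in agentic workflows such as tool use, multi-step problem solving, and issue resolution.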
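
To make the budget rule concrete, here is a minimal sketch that rounds a requested budget up to a multiple of 512 and attaches it to a chat request body. The field name `thinking_budget` is an assumption, as are the helper names `normalize_budget` and `build_payload`; confirm the actual parameter in the SiliconFlow API documentation before relying on it.

```python
BUDGET_STEP = 512  # budgets are set in multiples of 512 tokens


def normalize_budget(requested_tokens: int) -> int:
    """Round a requested budget up to the nearest multiple of 512 (0 stays 0)."""
    if requested_tokens < 0:
        raise ValueError("budget must be non-negative")
    # Ceiling division, then scale back up to a whole number of steps.
    return -(-requested_tokens // BUDGET_STEP) * BUDGET_STEP


def build_payload(prompt: str, requested_budget: int) -> dict:
    """Build a chat-completions body with a thinking budget (field name assumed)."""
    return {
        "model": "ByteDance-Seed/Seed-OSS-36B-Instruct",
        "messages": [{"role": "user", "content": prompt}],
        # ASSUMPTION: "thinking_budget" is a hypothetical field name.
        "thinking_budget": normalize_budget(requested_budget),
    }


# A simple FAQ gets no thinking; a harder question gets a rounded-up budget.
print(build_payload("What are your opening hours?", 0)["thinking_budget"])
print(build_payload("Why does my deployment fail?", 1500)["thinking_budget"])
```

Rounding up rather than down is a design choice in this sketch: it never gives the model less room than the caller asked for.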

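In addition, Seed-OSS-36B-Instruct matches or exceeds the performance of top open-source models in its class, including Qwen3-30B-A3B-Thinking-2507, Qwen3-32B, and OAI-OSS-20B, across math, coding, reasoning, agentic tasks, and long-context processing.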

Benchmark	Seed-OSS-36B-Instruct	Qwen3-30B-A3B-Thinking-2507	Qwen3-32B	OAI-OSS-20B	Gemma3-27B
Knowledge
MMLU-Pro	🥇82.7	81.9	81.8	76.2	67.5
MMLU	🥇87.4	86.9	86.2	81.7	76.9
GPQA-D	71.4	71.4	66.7	72.2	42.4
Math
AIME24	91.7	87.7	82.7	92.7
AIME25	84.7	81.3	73.3	90.3
Reasoning
HLE	10.1	8.7	6.9	12.7
Coding
LiveCodeBench v6	🥇67.4	60.3	53.4	63.8
Agent
TAU1-Retail	🥇70.4	58.7	40.9	54.8
SWE-Bench Verified	🥇47	39.7	23.4	60.7
Long context
RULER (128K)	🥇94.6	94.5	77.5	78.7

Real-world use cases

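How does the thinking budget work in practice? When you set a thinking budget, the model operates transparently. Below is an example with the thinking budget set to 512: during reasoning, the model periodically triggers self-reflection to estimate the consumed and remaining budget, and delivers the final response once the budget is exhausted or the reasoning is complete.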

<seed:think>
Got it, let's try to solve this problem step by step. The problem says ... ...
<seed:cot_budget_reflect>I have used 129 tokens, and there are 383 tokens remaining for use.</seed:cot_budget_reflect>
Using the power rule, ... ...
<seed:cot_budget_reflect>I have used 258 tokens, and there are 254 tokens remaining for use.</seed:cot_budget_reflect>
Alternatively, remember that ... ...
<seed:cot_budget_reflect>I have used 393 tokens, and there are 119 tokens remaining for use.</seed:cot_budget_reflect>
Because if ... ...
<seed:cot_budget_reflect>I have exhausted my token budget, and now I will start answering the question.</seed:cot_budget_reflect>
</seed:think>
To solve the problem, we start by using the properties of logarithms to simplify the given equations: (full answer omitted)
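
Because the budget reflections are emitted as tagged text, a client can read them back out. The sketch below assumes the raw output keeps the `<seed:think>` and `<seed:cot_budget_reflect>` tags exactly as in the sample above; whether a given API response exposes them this way is an assumption, and the function name `parse_budget_trace` is hypothetical.

```python
import re

# Matches the periodic "used / remaining" reflections shown in the sample.
REFLECT_RE = re.compile(
    r"<seed:cot_budget_reflect>I have used (\d+) tokens, "
    r"and there are (\d+) tokens remaining for use\.</seed:cot_budget_reflect>"
)
# Matches the whole reasoning block, including newlines.
THINK_RE = re.compile(r"<seed:think>(.*?)</seed:think>", re.DOTALL)


def parse_budget_trace(raw: str) -> tuple[list[tuple[int, int]], str]:
    """Return the (used, remaining) checkpoints and the answer after the think block."""
    checkpoints = [(int(used), int(left)) for used, left in REFLECT_RE.findall(raw)]
    answer = THINK_RE.sub("", raw, count=1).strip()
    return checkpoints, answer


sample = (
    "<seed:think>\n"
    "Got it, let's try to solve this problem step by step. ...\n"
    "<seed:cot_budget_reflect>I have used 129 tokens, and there are 383 tokens "
    "remaining for use.</seed:cot_budget_reflect>\n"
    "Using the power rule, ...\n"
    "<seed:cot_budget_reflect>I have exhausted my token budget, and now I will "
    "start answering the question.</seed:cot_budget_reflect>\n"
    "</seed:think>\n"
    "To solve the problem, we start by ..."
)
checkpoints, answer = parse_budget_trace(sample)
print(checkpoints)
print(answer)
```

In the sample above, each reflection's used and remaining counts add up to the 512-token budget, which gives a quick sanity check on parsed checkpoints.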

This controllable reasoning, combined with advanced agentic capabilities, opens up powerful use cases:

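Adaptive customer support:
Scale AI reasoning to query complexity: quick answers for common questions, in-depth analysis for technical issues. Control costs while maintaining service quality across both simple and complex customer interactions.
Enterprise document intelligence:
Extract and analyze information from long documents such as compliance manuals, contract bundles, or regulatory frameworks. Work across multiple related documents while preserving the contextual connections between them.
Intelligent development workflows:
Run quick syntax checks with a zero thinking budget, and use full reasoning power for comprehensive architecture reviews. Process an entire codebase in a single session rather than isolated code snippets.
Global operations:
Deploy consistent AI support across international markets through native multilingual capabilities. Support cross-jurisdiction research, cultural adaptation insights, and regional market analysis in a unified workflow.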

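Whether you are optimizing customer support efficiency, processing large document repositories, streamlining development workflows, or scaling global operations, the model adapts to your specific needs while maintaining transparency and cost predictability.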

Get started now

Explore: Try Seed-OSS-36B-Instruct in the SiliconFlow Playground.
Integrate: Use our OpenAI-compatible API. See the SiliconFlow API documentation for the full API specification.

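import requests

# SiliconFlow's OpenAI-compatible chat completions endpoint.
url = "https://api.siliconflow.com/v1/chat/completions"

# Request body: the model to call and the conversation so far.
payload = {
    "model": "ByteDance-Seed/Seed-OSS-36B-Instruct",
    "messages": [
        {
            "role": "user",
            "content": "tell me a story"
        }
    ]
}
# Replace <token> with your SiliconFlow API key.
headers = {
    "Authorization": "Bearer <token>",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

# Print the raw JSON response body.
print(response.text)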
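
The request above prints the raw JSON body. Since the API is described as OpenAI-compatible, the reply text would normally sit at `choices[0].message.content`; the sketch below assumes that shape and uses a hand-written sample response rather than real API output. The helper name `extract_reply` is hypothetical.

```python
import json


def extract_reply(response_text: str) -> str:
    """Pull the assistant message out of an OpenAI-style chat completion body."""
    body = json.loads(response_text)
    choices = body.get("choices") or []
    if not choices:
        # A body without choices is treated as an error and surfaced to the caller.
        raise ValueError(f"no choices in response: {body}")
    return choices[0]["message"]["content"]


# Hand-written stand-in for response.text, not real API output.
fake_response_text = json.dumps(
    {"choices": [{"message": {"role": "assistant", "content": "Once upon a time..."}}]}
)
print(extract_reply(fake_response_text))
```

In a real call, you would pass `response.text` from the request above in place of `fake_response_text`.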
