Kimi-K2 on SiliconFlow: Tailor-Made for AI Agents and Priced to Fit

July 15, 2025

Key technical highlights of Kimi K2:

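Large-scale training: Pre-trained a 1T-parameter MoE model on 15.5T tokens with no training instability.
MuonClip optimizer: Applies the Muon optimizer at an unprecedented scale and develops new optimization techniques to resolve instabilities during scale-up.
Agentic intelligence: Tailored for AI agents, with tool use, reasoning, and autonomous problem-solving.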

Get Started Now

Explore: Try Kimi-K2-Instruct on the SiliconFlow Models page.
Integrate: Use our OpenAI-compatible API. The full API specification is available in the SiliconFlow API documentation.

from openai import OpenAI

url = 'https://api.siliconflow.com/v1/'
api_key = 'your_api_key'

client = OpenAI(
    base_url=url,
    api_key=api_key
)

# Send a request with streaming output
content = ""
messages = [
    {"role": "user", "content": "Explain the concept of gravitational waves in Chinese."}
]
response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=messages,
    stream=True,  # Enable streaming output
    max_tokens=8192
)
# Receive the response chunk by chunk and accumulate the text
for chunk in response:
    # Skip chunks that carry no choices or no text delta
    if chunk.choices and chunk.choices[0].delta.content:
        content += chunk.choices[0].delta.content

# Show the full answer once the stream has finished
print(content)
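
The streaming loop above can also be wrapped in a small helper so the accumulation logic is reusable and can be checked without a network call. The following is a minimal sketch, not part of the SiliconFlow or OpenAI SDKs: collect_stream and fake_chunk are hypothetical names introduced here, and the fake chunks only imitate the choices[0].delta.content shape that the loop above reads.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a streamed chat completion into one string.

    Hypothetical helper: it only relies on the attribute shape
    chunk.choices[0].delta.content used in the loop above.
    """
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        text = getattr(chunk.choices[0].delta, "content", None)
        # Deltas without text (None or empty string) add nothing.
        if text:
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in for a streamed chunk, used only for this demo."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Offline demo: two text deltas, one empty delta, one chunk without choices.
demo = [
    fake_chunk("Gravitational "),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk("waves"),
]
print(collect_stream(demo))  # → Gravitational waves
```

With a real client, the same helper could be called as collect_stream(response), where response is the object returned by client.chat.completions.create(..., stream=True).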
