Hunyuan-A13B-Instruct Is Now Available on SiliconFlow

June 30, 2025


The Tencent Hunyuan AI team has announced Hunyuan-A13B-Instruct, an open-source large language model (LLM) that is now live on the SiliconFlow platform.

Built on a fine-grained Mixture-of-Experts (MoE) architecture, the model scales efficiently to 80B total parameters while activating only 13B per inference, delivering state-of-the-art performance across multiple benchmarks, particularly in mathematics, science, and agentic tasks.

SiliconFlow supports:

  • Extended context: a 128K-token context window by default (256K available on request).

  • Cost-optimized pricing: $0.14 per million input tokens and $0.57 per million output tokens.
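As a rough illustration of these rates, the per-request cost can be estimated with a short Python helper. The prices are taken from this announcement; the token counts in the example are purely illustrative.

```python
# Per-million-token rates quoted in this announcement (USD)
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 0.57

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 100K-token prompt with a 4K-token completion
print(round(estimate_cost_usd(100_000, 4_000), 6))  # → 0.01628
```

Even a request that nearly fills the default 128K context stays in the cent range at these rates.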

Why Does Hunyuan-A13B-Instruct Matter?

  • Compact yet powerful: with only 13B active parameters (80B total), the model delivers competitive results across a wide range of benchmark tasks, rivaling much larger models.

  • Hybrid reasoning support: offers both fast and slow thinking modes, letting users flexibly choose based on their needs.

  • Ultra-long context understanding: natively supports a 256K context window, maintaining stable performance on long-text tasks.

  • Enhanced agentic capabilities: optimized for agent tasks, achieving leading results on benchmarks such as BFCL-v3, τ-Bench, and C3-Bench.

  • Efficient inference: uses Grouped Query Attention (GQA) and supports multiple quantization formats for efficient inference.

Getting Started

Try the Hunyuan-A13B-Instruct model directly on the SiliconFlow model playground.

Quick API Access

The following Python example demonstrates how to call the Hunyuan-A13B-Instruct model via SiliconFlow's API endpoint. For full specifications, see the SiliconFlow API documentation.

from openai import OpenAI

url = 'https://api.siliconflow.com/v1/'
api_key = 'your_api_key'

client = OpenAI(
    base_url=url,
    api_key=api_key
)

# Send a request with streaming output
content = ""
reasoning_content = ""
messages = [
    {"role": "user", "content": "How do you implement a binary search algorithm in Python with detailed comments?"}
]
response = client.chat.completions.create(
    model="tencent/Hunyuan-A13B-Instruct",
    messages=messages,
    stream=True,  # Enable streaming output
    max_tokens=4096,
    extra_body={
        "thinking_budget": 1024
    }
)
# Stream and accumulate the answer and the model's reasoning trace
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:
        content += delta.content
    # reasoning_content is a provider-specific field; guard in case it is absent
    if getattr(delta, "reasoning_content", None):
        reasoning_content += delta.reasoning_content

# Round 2: continue the conversation using the accumulated answer
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Continue"})
response = client.chat.completions.create(
    model="tencent/Hunyuan-A13B-Instruct",
    messages=messages,
    stream=True
)
# Consume the second streamed response as well
for chunk in response:
    if chunk.choices[0].delta.content:
        content += chunk.choices[0].delta.content
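The two-round pattern above (append the assistant's accumulated reply, then the next user turn) can be factored into a small helper. Note that `extend_conversation` is a hypothetical name introduced here for illustration, not part of any SiliconFlow SDK.

```python
def extend_conversation(messages, assistant_reply, next_user_message):
    """Return a new message list with the assistant's accumulated reply
    and the next user turn appended, leaving earlier rounds untouched."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]

history = [{"role": "user", "content": "Explain binary search."}]
history = extend_conversation(history, "Binary search halves the range...", "Continue")
# history now holds three turns: user, assistant, user
```

Returning a new list rather than mutating in place makes it easy to branch a conversation from any earlier round.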

Hunyuan-A13B-Instruct is an ideal choice for researchers and developers seeking high performance. Whether for academic research, cost-effective AI solution development, or exploration of innovative applications, the model provides a solid foundation for progress.

Start building with Hunyuan-A13B-Instruct on SiliconFlow today!

Ready to accelerate your AI development?
