The Tencent Hunyuan AI team announced the release of Hunyuan-A13B-Instruct, an open-source large language model (LLM) now available on the SiliconFlow platform.
Built on a fine-grained Mixture-of-Experts (MoE) architecture, the model scales to 80B total parameters while activating only 13B per forward pass, achieving state-of-the-art performance across multiple benchmarks, particularly in mathematics, science, and agent tasks.
SiliconFlow supports:
Extended Context: Default 128K token context window (256K available upon request).
Cost-Optimized Pricing: $0.14/M tokens (input) and $0.57/M tokens (output).
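The per-million-token prices above make cost estimation straightforward. A quick back-of-envelope sketch (the example token counts are illustrative only):

```python
# Estimate request cost from the listed SiliconFlow prices:
# $0.14 per million input tokens, $0.57 per million output tokens.
INPUT_PRICE = 0.14 / 1_000_000   # dollars per input token
OUTPUT_PRICE = 0.57 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# e.g. a 100K-token context with a 4K-token answer:
print(f"${request_cost(100_000, 4_000):.4f}")  # $0.0163
```

Even a full long-context request costs well under two cents at these rates.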
Why Hunyuan-A13B-Instruct Matters
Compact yet Powerful: With only 13 billion active parameters (out of a total of 80 billion), the model delivers competitive performance on a wide range of benchmark tasks, rivaling much larger models.
Hybrid Reasoning Support: Supports both fast and slow thinking modes, allowing users to flexibly choose according to their needs.
Ultra-Long Context Understanding: Natively supports a 256K context window, maintaining stable performance on long-text tasks.
Enhanced Agent Capabilities: Optimized for agent tasks, achieving leading results on benchmarks such as BFCL-v3, τ-Bench and C3-Bench.
Efficient Inference: Utilizes Grouped Query Attention (GQA) and supports multiple quantization formats, enabling highly efficient inference.
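Grouped Query Attention reduces memory traffic by letting several query heads share a single key/value head, shrinking the KV cache. The NumPy sketch below illustrates the mechanism; the head counts and dimensions are illustrative, not Hunyuan-A13B's actual configuration:

```python
# Minimal sketch of Grouped Query Attention (GQA): each KV head serves
# a group of query heads, so only n_kv_heads K/V tensors are cached.
import numpy as np

def gqa(q, k, v, n_q_heads, n_kv_heads):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d)."""
    group = n_q_heads // n_kv_heads      # query heads per KV head
    # Repeat each KV head so every query head has a partner.
    k = np.repeat(k, group, axis=0)      # -> (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (heads, seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)        # row-wise softmax
    return weights @ v                               # (heads, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # only 2 KV heads cached
v = rng.standard_normal((2, 4, 16))
out = gqa(q, k, v, n_q_heads=8, n_kv_heads=2)
print(out.shape)  # (8, 4, 16)
```

With 8 query heads but only 2 KV heads, the cached K/V tensors are a quarter of the size multi-head attention would need.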
Quick Start
Try the Hunyuan-A13B-Instruct model directly on the SiliconFlow playground.
Quick Access to API
The following Python example demonstrates how to invoke the Hunyuan-A13B-Instruct model using SiliconFlow's API endpoint. For full parameter details, please refer to the SiliconFlow API documentation.
from openai import OpenAI

url = 'https://api.siliconflow.com/v1/'
api_key = 'your_api_key'

client = OpenAI(
    base_url=url,
    api_key=api_key
)

content = ""
reasoning_content = ""
messages = [
    {"role": "user", "content": "How do you implement a binary search algorithm in Python with detailed comments?"}
]

response = client.chat.completions.create(
    model="tencent/Hunyuan-A13B-Instruct",
    messages=messages,
    stream=True,
    max_tokens=4096,
    extra_body={
        "thinking_budget": 1024
    }
)

# Accumulate the streamed answer and the (optional) reasoning trace.
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:
        content += delta.content
    if getattr(delta, "reasoning_content", None):
        reasoning_content += delta.reasoning_content

# Continue the conversation with the assistant's reply in context.
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Continue"})

response = client.chat.completions.create(
    model="tencent/Hunyuan-A13B-Instruct",
    messages=messages,
    stream=True
)
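The example prompt asks the model for a binary search implementation. For reference, a plain Python version of what a correct answer looks like:

```python
def binary_search(arr, target):
    """Return the index of target in sorted arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2      # midpoint of the current window
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1          # discard the left half
        else:
            hi = mid - 1          # discard the right half
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # 3
```

Checking the streamed `content` against a known-good implementation like this is an easy way to sanity-check your integration.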
Hunyuan-A13B-Instruct is an ideal choice for researchers and developers seeking high performance. Whether for academic research, cost-effective AI solution development, or innovative application exploration, this model provides a robust foundation for advancement.
Start building with Hunyuan-A13B-Instruct today at SiliconFlow!