Hunyuan-A13B-Instruct
tencent/Hunyuan-A13B-Instruct
Hunyuan-A13B-Instruct is a mixture-of-experts model that activates only 13B of its 80B parameters per token, yet is reported to match much larger LLMs on mainstream benchmarks. It offers hybrid reasoning, with a low-latency "fast" mode and a higher-precision "slow" mode that can be selected per call. The model natively supports a 256K-token context, enough for book-length documents; this listing exposes a 131,072-token (128K) window. Its agent capabilities are tuned for BFCL-v3, τ-Bench and C3-Bench, where it is reported to score among the leaders, which makes it a candidate backbone for autonomous assistants. Grouped Query Attention and support for multiple quantization formats reduce memory use and improve GPU efficiency at inference time. The model also provides multilingual support and safety alignment aimed at enterprise applications.
Details
Model Provider
hunyuan
Type
text
Sub Type
chat
Size
80B
Publish Time
Jun 30, 2025
Input Price
$0.14 / M Tokens
Output Price
$0.57 / M Tokens
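As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost` is hypothetical, and the calculation assumes simple linear billing with no minimum charge, rounding, or discounts, which the listing does not specify.

```python
# Prices copied from the listing, in USD per million tokens.
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 0.57


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumes linear per-token billing (an assumption, not stated in the listing).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(100_000, 2_000):.5f}")
```

Under these assumptions, output tokens cost roughly four times as much as input tokens, so long "slow"-mode reasoning traces are likely to dominate the bill for short prompts.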
Context length
131072
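To show what the listed context length means in practice, the sketch below computes how many tokens remain for the reply once a prompt is counted. It assumes that prompt and completion share a single 131,072-token window, which is a common convention but is not stated in the listing; the helper name `max_output_tokens` is hypothetical.

```python
# Context length as shown in the listing (128 * 1024 tokens).
CONTEXT_LENGTH = 131072


def max_output_tokens(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return the tokens left for the reply.

    Assumes prompt and completion share one context window (an assumption).
    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    # A 100,000-token document leaves 31,072 tokens for the answer.
    print(max_output_tokens(100_000))
```

A prompt longer than the listed window would need to be truncated or split before being sent, even though the model's native context is described as larger.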
Tags
Reasoning, MoE, 80B, 128K