State-of-the-Art AI Model Library

One API to run inference on 200+ cutting-edge AI models, and deploy in seconds

Tencent

Text Generation

Hunyuan-MT-7B

Released on: Sep 18, 2025

The Hunyuan Translation Model consists of a translation model, Hunyuan-MT-7B, and an ensemble model, Hunyuan-MT-Chimera. Hunyuan-MT-7B is a lightweight translation model with 7 billion parameters used to translate source text into the target language. The model supports mutual translation among 33 languages, including five ethnic minority languages in China. In the WMT25 machine translation competition, Hunyuan-MT-7B won first place in 30 out of the 31 language categories it participated in, demonstrating its outstanding translation capabilities. For translation tasks, Tencent Hunyuan proposed a comprehensive training framework covering pre-training, supervised fine-tuning, translation enhancement, and ensemble refinement, achieving state-of-the-art performance among models of a similar scale. The model is computationally efficient and easy to deploy, making it suitable for various application scenarios...

Total Context: 33K

Max output: 33K

Input: 0.0 / M Tokens

Output: 0.0 / M Tokens

Tencent

Text Generation

Hunyuan-A13B-Instruct

Released on: Jun 30, 2025

Hunyuan-A13B-Instruct activates only 13B of its 80B parameters, yet matches much larger LLMs on mainstream benchmarks. It offers hybrid reasoning: a low-latency “fast” mode or a high-precision “slow” mode, switchable per call. A native 256K-token context lets it digest book-length documents without degradation. Its agent skills are tuned for leading results on BFCL-v3, τ-Bench and C3-Bench, making it an excellent backbone for autonomous assistants. Grouped Query Attention plus multi-format quantization delivers memory-light, GPU-efficient inference for real-world deployment, with built-in multilingual support and robust safety alignment for enterprise-grade applications....

Total Context: 131K

Max output: 131K

Input: 0.14 / M Tokens

Output: 0.57 / M Tokens
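To make the per-million-token pricing concrete, here is a minimal sketch of how a request's cost could be estimated from the rates listed for Hunyuan-A13B-Instruct. The function name `estimate_cost` and the example token counts are hypothetical, and the listing does not state a currency, so the result is simply in whatever unit the listing uses.

```python
from decimal import Decimal

# Rates copied from the Hunyuan-A13B-Instruct listing (per million tokens).
# Assumption: the currency is not stated in the listing, so none is implied here.
INPUT_RATE_PER_M = Decimal("0.14")
OUTPUT_RATE_PER_M = Decimal("0.57")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate: Decimal = INPUT_RATE_PER_M,
    output_rate: Decimal = OUTPUT_RATE_PER_M,
) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not a platform API).

    Input and output tokens are billed at separate per-million-token rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Illustrative request: 100,000 prompt tokens and 20,000 generated tokens.
    print(estimate_cost(100_000, 20_000))
```

For the illustrative request, the arithmetic is (100,000 × 0.14 + 20,000 × 0.57) / 1,000,000 = 0.0254 in the listing's unit. Passing both rates as `Decimal("0.0")` gives the equivalent estimate for Hunyuan-MT-7B, which is listed at 0.0 for input and output.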
