
Ling-flash-2.0 API, Deployment, Pricing
inclusionAI/Ling-flash-2.0
Ling-flash-2.0 is a language model from inclusionAI with a total of 100 billion parameters, of which 6.1 billion are activated per token (4.8 billion non-embedding). As part of the Ling 2.0 architecture series, it is designed as a lightweight yet powerful Mixture-of-Experts (MoE) model that aims to match or exceed the performance of 40B-class dense models and larger MoE models while activating a significantly smaller number of parameters per token. The model reflects a strategy of pursuing high performance and efficiency through highly optimized architecture design and training methods.
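For local experimentation, a minimal sketch of running the model through Hugging Face transformers is shown below; the loading arguments (trust_remote_code, device_map) and the sample prompt are assumptions rather than an official recipe.

```python
# Minimal sketch: load inclusionAI/Ling-flash-2.0 with Hugging Face transformers.
# trust_remote_code and the generation settings are assumptions, not an official recipe.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ling-flash-2.0"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the checkpoint's native precision
    device_map="auto",       # spread the MoE weights across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain what a Mixture-of-Experts model is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For production serving, a dedicated inference engine (for example vLLM or a similar server) is a common deployment choice for MoE models of this size, though the exact setup depends on the hardware available.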
Details
Model Provider: inclusionAI
Type: text
Sub Type: chat
Size: 100B
Publish Time: Sep 18, 2025
Input Price: $0.14 / M Tokens
Output Price: $0.57 / M Tokens (see the cost sketch below)
Context Length: 131K
Tags: MoE, 106B, A6B, 131K
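As a quick illustration of the listed rates ($0.14 per million input tokens, $0.57 per million output tokens), the sketch below estimates the cost of a single request; the token counts used in the example are hypothetical.

```python
# Cost-estimate sketch using the listed per-million-token prices.
INPUT_PRICE_PER_M = 0.14   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.57  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical request: a 4,000-token prompt with a 1,000-token reply.
print(f"${estimate_cost(4_000, 1_000):.6f}")  # ≈ $0.001130
```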