State-of-the-Art AI Model Library

One API to run inference on 200+ cutting-edge AI models, and deploy in seconds

Z.ai

Text Generation

GLM-5.1

GLM-5.1 is Z.ai's next-generation flagship model built for agentic engineering. It is designed to run continuously for hours or even longer, refining its strategy as it works—the longer it runs, the better the results....

Total Context: 205K
Max output: 131K
Input: $1.4 / M Tokens
Cached Input: price not shown
Output: $4.4 / M Tokens

Z.ai

Text Generation

GLM-5V-Turbo

GLM-5V-Turbo is Zhipu’s latest flagship multimodal foundation model, optimized for multimodal coding and agent capabilities. It supports up to 200K tokens of image, video, and text context, and, when integrated with frameworks such as Claude Code and OpenClaw, can handle complex long-horizon programming and assistant tasks....

Total Context: 205K
Max output: 131K
Input: $1.2 / M Tokens
Cached Input: price not shown
Output: $4.0 / M Tokens

Z.ai

Text Generation

GLM-5

GLM-5 is a next-generation open-source model for complex systems engineering and long-horizon agentic tasks, scaled to ~744B sparse parameters (~40B active) with ~28.5T pretraining tokens. It integrates DeepSeek Sparse Attention (DSA) to retain long-context capacity while reducing inference cost, and leverages the “slime” asynchronous RL stack to deliver strong performance in reasoning, coding, and agentic benchmarks....

Total Context: 205K
Max output: 131K
Input: $0.95 / M Tokens
Cached Input: price not shown
Output: $2.55 / M Tokens

Z.ai

Text Generation

GLM-4.7

GLM-4.7 is Zhipu’s new-generation flagship model, with 355B total parameters and 32B activated parameters, delivering comprehensive upgrades in general conversation, reasoning, and agent capabilities. Responses are more concise and natural; writing feels more immersive; tool-call instructions are followed more reliably; and the front-end polish of artifacts and agentic coding—along with long-horizon task completion efficiency—has been further improved....

Total Context: 205K
Max output: 205K
Input: $0.42 / M Tokens
Cached Input: price not shown
Output: $2.2 / M Tokens

Z.ai

Text Generation

GLM-4.6V

GLM-4.6V achieves SOTA (State-of-the-Art) accuracy in visual understanding among models of the same parameter scale. For the first time, it natively integrates Function Call capabilities into the visual model architecture, bridging the gap between "Visual Perception" and "Executable Action." This provides a unified technical foundation for multimodal Agents in real-world business scenarios. Additionally, the visual context window has been expanded to 128K, supporting long video stream processing and high-resolution multi-image analysis....

Total Context: 131K
Max output: 131K
Input: $0.3 / M Tokens
Cached Input: price not shown
Output: $0.9 / M Tokens

Z.ai

Text Generation

GLM-4.6

Compared with GLM-4.5, GLM-4.6 brings several key improvements: a longer context window (expanded to 200K tokens), superior coding performance, advanced reasoning, more capable agents, and refined writing....

Total Context: 205K
Max output: 205K
Input: $0.39 / M Tokens
Cached Input: price not shown
Output: $1.9 / M Tokens

Z.ai

Text Generation

GLM-4.5-Air

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. It is also a hybrid reasoning model that provides both thinking and non-thinking modes....

Total Context: 131K
Max output: 131K
Input: $0.14 / M Tokens
Cached Input: price not shown
Output: $0.86 / M Tokens
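Because every model above is priced per million tokens, the cost of a single request can be estimated with simple arithmetic. The following is a minimal sketch, not an official calculator: the input and output prices are copied from the listings above, while the function name `estimate_cost_usd` and the table name `PRICES_PER_M_TOKENS` are hypothetical. Cached-input pricing is left out because no value is shown for it, so the estimate assumes every input token is billed at the regular input rate.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the listings above.
# Cached-input prices are omitted because the listings do not show them.
PRICES_PER_M_TOKENS = {
    "GLM-5.1": (Decimal("1.4"), Decimal("4.4")),
    "GLM-5V-Turbo": (Decimal("1.2"), Decimal("4.0")),
    "GLM-5": (Decimal("0.95"), Decimal("2.55")),
    "GLM-4.7": (Decimal("0.42"), Decimal("2.2")),
    "GLM-4.6V": (Decimal("0.3"), Decimal("0.9")),
    "GLM-4.6": (Decimal("0.39"), Decimal("1.9")),
    "GLM-4.5-Air": (Decimal("0.14"), Decimal("0.86")),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes all input tokens are billed at the uncached input rate.
    """
    input_price, output_price = PRICES_PER_M_TOKENS[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Compare the same workload across every listed model.
    for name in PRICES_PER_M_TOKENS:
        print(name, estimate_cost_usd(name, 100_000, 20_000))
```

`Decimal` is used instead of `float` so that small per-token prices add up without binary rounding error. For example, 100,000 input tokens and 20,000 output tokens on GLM-5.1 come to (1.4 × 100,000 + 4.4 × 20,000) / 1,000,000 = $0.228.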

Ready to accelerate your AI development?
