GLM-Z1-32B-0414
THUDM/GLM-Z1-32B-0414
GLM-Z1-32B-0414 is a reasoning model with deep thinking capabilities. It was developed from GLM-4-32B-0414 through cold-start and extended reinforcement learning, together with further training on mathematics, code, and logic tasks. Compared to the base model, GLM-Z1-32B-0414 significantly improves mathematical ability and the capacity to solve complex tasks. During training, the team also introduced general reinforcement learning based on pairwise ranking feedback, which further enhances the model's general capabilities. Despite having only 32B parameters, its performance on certain tasks is comparable to that of DeepSeek-R1, which has 671B parameters. Evaluations on benchmarks such as AIME 24/25, LiveCodeBench, and GPQA show strong mathematical reasoning, and the model can support solutions for a wider range of complex tasks.
Details
Model Provider
THUDM
Type
text
Sub Type
chat
Size
32B
Publish Time
Apr 18, 2025
Input Price
$0.14 / M Tokens
Output Price
$0.57 / M Tokens
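As a rough illustration of how the listed prices turn into a per-request cost, here is a minimal sketch. It reads "/ M Tokens" as "per one million tokens" and assumes input and output tokens are billed separately at the two listed rates. The function name `estimate_cost` and the constant names are hypothetical and not part of any provider API.

```python
from decimal import Decimal

# Prices taken from the listing above, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("0.14")
OUTPUT_PRICE_PER_M = Decimal("0.57")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes input and output tokens are billed independently at the
    listed rates, with no minimum charge or rounding applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


if __name__ == "__main__":
    # A 2,000-token prompt with an 8,000-token reasoning-heavy reply.
    print(estimate_cost(2_000, 8_000))
```

Because a reasoning model tends to emit long outputs, the output rate usually dominates the total in this kind of estimate. `Decimal` is used instead of floats so that small per-request amounts do not pick up binary rounding noise.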
Context length
32768
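The context length can be used to budget how many tokens remain for generation. The sketch below assumes that the 32768-token window is shared between the prompt and the generated output (including any reasoning text), which the listing does not state explicitly. The helper name `remaining_output_budget` is hypothetical.

```python
# Context length from the listing above, in tokens.
CONTEXT_LENGTH = 32768


def remaining_output_budget(prompt_tokens: int,
                            context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens are left for generation (hypothetical helper).

    Assumes prompt and output share a single context window; returns 0
    when the prompt alone already fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    # A 4,000-token prompt leaves the rest of the window for the reply.
    print(remaining_output_budget(4_000))
```

A caller could pass the result as an upper bound when choosing a maximum output length, keeping in mind that token counts depend on the model's own tokenizer.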
Tags
32B,32K,Reasoning