DeepSeek-R1-Distill-Qwen-1.5B
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-1.5B is a distilled model based on Qwen2.5-Math-1.5B. It was fine-tuned on 800k curated samples generated by DeepSeek-R1. Despite its small size, it reaches 83.9% accuracy on MATH-500, a 28.9% pass rate on AIME 2024, and a CodeForces rating of 954, showing reasoning capabilities beyond what its parameter count would suggest.
Details
Model Provider: deepseek-ai
Type: text
Sub Type: chat
Size: 2
Publish Time: Jan 20, 2025
Input Price: $0.02 / M Tokens
Output Price: $0.02 / M Tokens
Context length: 32768
Tags: Reasoning, 1.5B, 32K, Math
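The listed prices and context length can be turned into a rough per-request estimate. The sketch below is illustrative only: the helper names estimate_cost and fits_context are hypothetical, and it assumes that input and output tokens are billed separately at the listed rates and that prompt and completion share the 32768-token context window. Neither assumption is confirmed by this listing.

```python
# Values copied from the listing; everything else here is an illustrative assumption.
INPUT_PRICE_PER_M = 0.02    # USD per million input tokens
OUTPUT_PRICE_PER_M = 0.02   # USD per million output tokens
CONTEXT_LENGTH = 32768      # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed per-million-token rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumption: prompt and completion together must fit in the context window."""
    return input_tokens + output_tokens <= CONTEXT_LENGTH


if __name__ == "__main__":
    # A reasoning model tends to produce long outputs, so budget for them.
    prompt, completion = 4_000, 8_000
    print(f"fits in context: {fits_context(prompt, completion)}")
    print(f"estimated cost: ${estimate_cost(prompt, completion):.6f}")
```

Because reasoning models emit long chains of thought before the final answer, the output token count usually dominates both the cost and the context budget.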