DeepSeek-R1-Distill-Qwen-1.5B

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-1.5B is a distilled model based on Qwen2.5-Math-1.5B. It was fine-tuned on 800k curated samples generated by DeepSeek-R1. Despite its small size, it achieves 83.9% accuracy on MATH-500, a 28.9% pass rate on AIME 2024, and a rating of 954 on CodeForces, showing reasoning capabilities beyond its parameter scale.

Details

Model Provider

deepseek-ai

Type

text

Sub Type

chat

Size

2

Publish Time

Jan 20, 2025

Input Price

$0.02 / M Tokens

Output Price

$0.02 / M Tokens

Context length

32768
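
The listed prices and context length can be turned into a quick per-request estimate. The following is a minimal sketch, not an official billing formula: the helper names `estimate_cost` and `fits_context` are hypothetical, and it assumes that prompt and completion tokens share the 32768-token window and that billing is strictly proportional to token count.

```python
from decimal import Decimal

# Figures copied from this listing.
INPUT_PRICE_PER_M = Decimal("0.02")   # USD per million input tokens
OUTPUT_PRICE_PER_M = Decimal("0.02")  # USD per million output tokens
CONTEXT_LENGTH = 32768                # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with token counts at the listed rates.
    """
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / million


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the context length (hypothetical helper).

    Assumption: prompt and completion tokens count toward the same window.
    """
    return input_tokens + output_tokens <= CONTEXT_LENGTH
```

Under these assumptions, a 2,000-token prompt with a 6,000-token response would cost about $0.00016 and fit comfortably within the context length. Reasoning models tend to produce long outputs, so the output side usually dominates both checks.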

Tags

Reasoning,1.5B,32K,Math

© 2025 SiliconFlow Technology PTE. LTD.