DeepSeek-R1-Distill-Qwen-14B
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-14B is a distilled model based on Qwen2.5-14B, fine-tuned on 800k curated samples generated by DeepSeek-R1. It shows strong reasoning performance on mathematics and programming benchmarks: 93.9% accuracy on MATH-500, a 69.7% pass rate on AIME 2024, and a CodeForces rating of 1481.
Details
Model Provider
deepseek-ai
Type
text
Sub Type
chat
Size
14B
Publish Time
Jan 20, 2025
Input Price
$0.1 / M Tokens
Output Price
$0.1 / M Tokens
Context length
32768 tokens
Tags
Reasoning, 14B, 32K
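
The pricing and context length listed above can be turned into a quick per-request estimate. The following is a minimal sketch, not an official billing formula: the function names `estimate_cost` and `fits_context` are hypothetical, and it assumes that the 32768-token context window is shared by the prompt and the generated output, which the listing does not state explicitly.

```python
# Figures copied from the listing above.
INPUT_PRICE_PER_M = 0.1    # USD per million input tokens
OUTPUT_PRICE_PER_M = 0.1   # USD per million output tokens
CONTEXT_LENGTH = 32768     # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check whether a request fits in the context window.

    Assumption: prompt and completion tokens share the same window.
    """
    return input_tokens + output_tokens <= CONTEXT_LENGTH


if __name__ == "__main__":
    # A 20,000-token prompt with a 4,000-token reasoning-heavy reply.
    print(f"{estimate_cost(20_000, 4_000):.4f}")  # prints 0.0024
    print(fits_context(20_000, 4_000))            # prints True
```

Reasoning models tend to produce long outputs, so the output token count often dominates both the cost estimate and the context budget.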