DeepSeek-R1-Distill-Llama-8B
deepseek-ai/DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Llama-8B is a distilled model based on Llama-3.1-8B. It was fine-tuned on samples generated by DeepSeek-R1 and demonstrates strong reasoning capabilities. Reported benchmark results include 89.1% accuracy on MATH-500, a 50.4% pass rate on AIME 2024, and a CodeForces rating of 1205, indicating strong mathematical and programming ability for an 8B-scale model.
Details
Model Provider
deepseek-ai
Type
text
Sub Type
chat
Size
8B
Publish Time
Jan 20, 2025
Input Price
$0.06 / M Tokens
Output Price
$0.06 / M Tokens
Context length
32768 tokens
Tags
Reasoning,8B,32K
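The listed prices and context length translate into a simple per-request estimate. The following is a minimal sketch, not part of any provider SDK: the function names `estimate_cost` and `fits_context` are hypothetical, and it assumes that the 32768-token context length covers prompt and generated tokens combined, which the listing does not state explicitly.

```python
# Values taken from the listing above; everything else is illustrative.
INPUT_PRICE_PER_M = 0.06   # USD per million input tokens
OUTPUT_PRICE_PER_M = 0.06  # USD per million output tokens
CONTEXT_LENGTH = 32768     # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the context length.

    Assumption: the limit applies to prompt plus generated tokens together.
    """
    return input_tokens + output_tokens <= CONTEXT_LENGTH


if __name__ == "__main__":
    # Example: a 4,000-token prompt with a 12,000-token reasoning trace and answer.
    print(f"fits: {fits_context(4_000, 12_000)}")
    print(f"cost: ${estimate_cost(4_000, 12_000):.6f}")
```

Because reasoning models tend to produce long outputs, the output-token count usually dominates both the cost and the context budget, so it is worth estimating it generously.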