About DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-1.5B is a distilled model based on Qwen2.5-Math-1.5B. It was fine-tuned on 800K curated samples generated by DeepSeek-R1 and performs well across a range of benchmarks. As a lightweight model, it reaches 83.9% accuracy on MATH-500, a 28.9% pass rate on AIME 2024, and a CodeForces rating of 954, showing reasoning ability beyond what its parameter count would suggest.
Explore how DeepSeek-V3's advanced reasoning and coding capabilities translate into real-world applications.
Automated Code Generation & Debugging
Generate, optimize, and debug complex code snippets across various programming languages. The model's strong reasoning helps identify logical errors and suggest efficient solutions.
Use Case Example:
"A software engineer used DeepSeek-V3 to refactor a legacy Python module, resulting in a 40% reduction in code complexity and a 25% improvement in execution speed."
Scientific & Mathematical Research
Assist researchers by solving complex mathematical problems, formulating hypotheses, and analyzing data. Its ability to reason through abstract concepts makes it a powerful tool for scientific discovery.
Use Case Example:
"A physicist modeled a complex quantum mechanics problem, and the model provided a step-by-step derivation that led to a novel insight, which was later verified experimentally."
Intelligent Agent & Tool Integration
Build sophisticated AI agents that can understand user requests, select the appropriate tools (e.g., APIs, databases), and execute multi-step tasks autonomously.
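The tool-selection step described above can be sketched in a few lines. This is a minimal illustration, not the page's or any vendor's implementation: the tool functions, the `TOOLS` registry, `run_plan`, and the JSON plan format are all hypothetical, and the plan is hard-coded to stand in for what a model might return.

```python
import json
from typing import Any, Callable, Dict, List


# Hypothetical tools; real ones would call flight, hotel, or car rental APIs.
def search_flights(origin: str, destination: str) -> Dict[str, Any]:
    return {"tool": "search_flights", "route": f"{origin}->{destination}"}


def book_hotel(city: str, nights: int) -> Dict[str, Any]:
    return {"tool": "book_hotel", "city": city, "nights": nights}


# Registry mapping the tool names a model may emit to Python callables.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "search_flights": search_flights,
    "book_hotel": book_hotel,
}


def run_plan(plan_json: str) -> List[Dict[str, Any]]:
    """Execute a JSON list of tool calls in order and collect the results."""
    results = []
    for step in json.loads(plan_json):
        name = step["name"]
        if name not in TOOLS:
            # Refuse unknown tools rather than guessing what was meant.
            raise ValueError(f"unknown tool: {name}")
        results.append(TOOLS[name](**step["arguments"]))
    return results


# Stand-in for model output (assumed format): one entry per tool call.
PLAN = json.dumps([
    {"name": "search_flights", "arguments": {"origin": "SFO", "destination": "NRT"}},
    {"name": "book_hotel", "arguments": {"city": "Tokyo", "nights": 3}},
])

if __name__ == "__main__":
    for result in run_plan(PLAN):
        print(result)
```

Keeping the registry explicit means the agent can only run functions that were deliberately exposed, and a misspelled or invented tool name fails loudly instead of being silently ignored.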
Use Case Example:
"An automated travel assistant powered by DeepSeek-V3 booked a complete itinerary by interacting with flight, hotel, and car rental APIs based on a single natural language request from the user."
Advanced Conversational AI
Create highly engaging and context-aware chatbots, virtual assistants, or role-playing characters for gaming and entertainment. The model excels at maintaining coherent and natural-sounding dialogue.
Use Case Example:
"A gaming company implemented an NPC (Non-Player Character) using the model, which provided dynamic, unscripted interactions that significantly enhanced player immersion."
Metadata
Specification
Status: Deprecated
Architecture
Calibrated: No
Mixture of Experts: No
Total Parameters: 2B
Activated Parameters:
Reasoning: No
Precision: FP8
Context Length: 33K
Max Tokens:
Compare with Other Models
See how this model compares with others.
DeepSeek-V3.2 (DeepSeek, chat)
Release date: December 4, 2025
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Output: $0.42 / M Tokens

DeepSeek-V3.2-Exp (DeepSeek, chat)
Release date: October 10, 2025
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Output: $0.41 / M Tokens

DeepSeek-V3.1-Terminus (DeepSeek, chat)
Release date: September 29, 2025
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Output: $1.0 / M Tokens

DeepSeek-V3.1 (DeepSeek, chat)
Release date: August 25, 2025
Total Context: 164K
Max output: 164K
Input: $0.27 / M Tokens
Output: $1.0 / M Tokens

DeepSeek-V3 (DeepSeek, chat)
Release date: December 26, 2024
Total Context: 164K
Max output: 164K
Input: $0.25 / M Tokens
Output: $1.0 / M Tokens

DeepSeek-R1 (DeepSeek, chat)
Release date: May 28, 2025
Total Context: 164K
Max output: 164K
Input: $0.5 / M Tokens
Output: $2.18 / M Tokens

DeepSeek-R1-Distill-Qwen-32B (DeepSeek, chat)
Release date: January 20, 2025
Total Context: 131K
Max output: 131K
Input: $0.18 / M Tokens
Output: $0.18 / M Tokens

DeepSeek-R1-Distill-Qwen-14B (DeepSeek, chat)
Release date: January 20, 2025
Total Context: 131K
Max output: 131K
Input: $0.1 / M Tokens
Output: $0.1 / M Tokens

DeepSeek-R1-Distill-Qwen-7B (DeepSeek, chat)
Release date: January 20, 2025
Total Context: 33K
Max output: 16K
Input: $0.05 / M Tokens
Output: $0.05 / M Tokens
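
Because input and output tokens are priced separately per million, comparing models on a concrete workload takes a little arithmetic. The sketch below shows one way to do it; the prices are copied from three entries in the listing above, while the `PRICES` table, the `estimate_cost` helper, and the example workload of 2M input and 0.5M output tokens are illustrative assumptions.

```python
from typing import Dict, Tuple

# (input, output) price in USD per million tokens, copied from the listing.
PRICES: Dict[str, Tuple[float, float]] = {
    "DeepSeek-V3.2": (0.27, 0.42),
    "DeepSeek-R1": (0.50, 2.18),
    "DeepSeek-R1-Distill-Qwen-7B": (0.05, 0.05),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.4f}")
```

Output price tends to dominate for models that produce long responses, so a model with a lower input price is not necessarily the cheaper choice for a generation-heavy workload.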
