Transparent Pricing
High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.
Transparent Pricing
High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Begin exploring our models and APIs with no commitment required.
Start for free
Begin exploring our models and APIs with no commitment required.
Start for free
Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.
Simple integration
Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.
Simple integration
Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.
Simple integration
Only pay for what you use. Set spending limits and monitor usage through our dashboard.
Pay-as-you-go
Only pay for what you use. Set spending limits and monitor usage through our dashboard.
Pay-as-you-go
Only pay for what you use. Set spending limits and monitor usage through our dashboard.
Pay-as-you-go
Serverless Pricing
Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!
Serverless Pricing
Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!
Serverless Pricing
Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!
Image Generation
Generate high-quality images from text prompts with our state-of-the-art image generation models.
Model Name
Price (/image)
FLUX 1.1 [pro]
$
0.04
Model Name
FLUX 1.1 [pro]
$
0.04
Price (
)
/ Image
FLUX 1.1 [pro]
$
0.04
FLUX 1.1 [pro] Ultra
$
0.06
Model Name
FLUX 1.1 [pro] Ultra
$
0.06
Price (
)
/ Image
FLUX 1.1 [pro] Ultra
$
0.06
FLUX.1-dev
$
0.014
Model Name
FLUX.1-dev
$
0.014
Price (
)
/ Image
FLUX.1-dev
$
0.014
FLUX.1-Kontext-dev
$
0.015
Model Name
FLUX.1-Kontext-dev
$
0.015
Price (
)
/ Image
FLUX.1-Kontext-dev
$
0.015
FLUX.1 Kontext [max]
$
0.08
Model Name
FLUX.1 Kontext [max]
$
0.08
Price (
)
/ Image
FLUX.1 Kontext [max]
$
0.08
FLUX.1 Kontext [pro]
$
0.04
Model Name
FLUX.1 Kontext [pro]
$
0.04
Price (
)
/ Image
FLUX.1 Kontext [pro]
$
0.04
FLUX.1-schnell
$
0.0014
Model Name
FLUX.1-schnell
$
0.0014
Price (
)
/ Image
FLUX.1-schnell
$
0.0014
Prices shown are per image generated or edited.
Prices shown are per image generated or edited.
Prices shown are per image generated or edited.
Video Generation
Create dynamic videos from text descriptions with our cutting-edge video generation models.
Model Name
Price (/video)
Wan2.1-I2V-14B-720P
$
0.29
Model Name
Wan2.1-I2V-14B-720P
$
0.29
Price (
)
/ Video
Wan2.1-I2V-14B-720P
$
0.29
Wan2.1-I2V-14B-720P (Turbo)
$
0.21
Model Name
Wan2.1-I2V-14B-720P (Turbo)
$
0.21
Price (
)
/ Video
Wan2.1-I2V-14B-720P (Turbo)
$
0.21
Wan2.1-T2V-14B
$
0.29
Model Name
Wan2.1-T2V-14B
$
0.29
Price (
)
/ Video
Wan2.1-T2V-14B
$
0.29
Wan2.1-T2V-14B (Turbo)
$
0.21
Model Name
Wan2.1-T2V-14B (Turbo)
$
0.21
Price (
)
/ Video
Wan2.1-T2V-14B (Turbo)
$
0.21
Wan2.2-I2V-A14B
$
0.29
Model Name
Wan2.2-I2V-A14B
$
0.29
Price (
)
/ Video
Wan2.2-I2V-A14B
$
0.29
Wan2.2-T2V-A14B
$
0.29
Model Name
Wan2.2-T2V-A14B
$
0.29
Price (
)
/ Video
Wan2.2-T2V-A14B
$
0.29
Prices shown are per video generated.
Prices shown are per video generated.
Prices shown are per video generated.
LLM
High-performance language models for conversational AI applications, with competitive per-token pricing.
Model Name
Context Length
Context Length
Input (/M Tokens)
Output (/M Tokens)
ERNIE-4.5-300B-A47B
131K
$
0.28
$
1.1
Model Name
ERNIE-4.5-300B-A47B
131K
Context Length
Input (/M Tokens)
$
1.1
$
1.1
Output (/M Tokens)
ERNIE-4.5-300B-A47B
131K
$
0.28
$
1.1
Seed-OSS-36B-Instruct
262K
$
0.21
$
0.57
Model Name
Seed-OSS-36B-Instruct
262K
Context Length
Input (/M Tokens)
$
0.57
$
0.57
Output (/M Tokens)
Seed-OSS-36B-Instruct
262K
$
0.21
$
0.57
DeepSeek-R1
164K
$
0.5
$
2.18
Model Name
DeepSeek-R1
164K
Context Length
Input (/M Tokens)
$
2.18
$
2.18
Output (/M Tokens)
DeepSeek-R1
164K
$
0.5
$
2.18
DeepSeek-R1-Distill-Qwen-14B
131K
$
0.1
$
0.1
Model Name
DeepSeek-R1-Distill-Qwen-14B
131K
Context Length
Input (/M Tokens)
$
0.1
$
0.1
Output (/M Tokens)
DeepSeek-R1-Distill-Qwen-14B
131K
$
0.1
$
0.1
DeepSeek-R1-Distill-Qwen-32B
131K
$
0.18
$
0.18
Model Name
DeepSeek-R1-Distill-Qwen-32B
131K
Context Length
Input (/M Tokens)
$
0.18
$
0.18
Output (/M Tokens)
DeepSeek-R1-Distill-Qwen-32B
131K
$
0.18
$
0.18
DeepSeek-R1-Distill-Qwen-7B
33K
$
0.05
$
0.05
Model Name
DeepSeek-R1-Distill-Qwen-7B
33K
Context Length
Input (/M Tokens)
$
0.05
$
0.05
Output (/M Tokens)
DeepSeek-R1-Distill-Qwen-7B
33K
$
0.05
$
0.05
DeepSeek-V3
164K
$
0.27
$
1.13
Model Name
DeepSeek-V3
164K
Context Length
Input (/M Tokens)
$
1.13
$
1.13
Output (/M Tokens)
DeepSeek-V3
164K
$
0.27
$
1.13
DeepSeek-V3.1
164K
$
0.27
$
1.1
Model Name
DeepSeek-V3.1
164K
Context Length
Input (/M Tokens)
$
1.1
$
1.1
Output (/M Tokens)
DeepSeek-V3.1
164K
$
0.27
$
1.1
DeepSeek-VL2
4K
$
0.15
$
0.15
Model Name
DeepSeek-VL2
4K
Context Length
Input (/M Tokens)
$
0.15
$
0.15
Output (/M Tokens)
DeepSeek-VL2
4K
$
0.15
$
0.15
Ling-mini-2.0
131K
$
0.07
$
0.29
Model Name
Ling-mini-2.0
131K
Context Length
Input (/M Tokens)
$
0.29
$
0.29
Output (/M Tokens)
Ling-mini-2.0
131K
$
0.07
$
0.29
Meta-Llama-3.1-8B-Instruct
33K
$
0.06
$
0.06
Model Name
Meta-Llama-3.1-8B-Instruct
33K
Context Length
Input (/M Tokens)
$
0.06
$
0.06
Output (/M Tokens)
Meta-Llama-3.1-8B-Instruct
33K
$
0.06
$
0.06
MiniMax-M1-80k
131K
$
0.55
$
2.2
Model Name
MiniMax-M1-80k
131K
Context Length
Input (/M Tokens)
$
2.2
$
2.2
Output (/M Tokens)
MiniMax-M1-80k
131K
$
0.55
$
2.2
Kimi-Dev-72B
131K
$
0.29
$
1.15
Model Name
Kimi-Dev-72B
131K
Context Length
Input (/M Tokens)
$
1.15
$
1.15
Output (/M Tokens)
Kimi-Dev-72B
131K
$
0.29
$
1.15
Kimi-K2-Instruct
131K
$
0.58
$
2.29
Model Name
Kimi-K2-Instruct
131K
Context Length
Input (/M Tokens)
$
2.29
$
2.29
Output (/M Tokens)
Kimi-K2-Instruct
131K
$
0.58
$
2.29
Kimi-K2-Instruct-0905
262K
$
0.58
$
2.29
Model Name
Kimi-K2-Instruct-0905
262K
Context Length
Input (/M Tokens)
$
2.29
$
2.29
Output (/M Tokens)
Kimi-K2-Instruct-0905
262K
$
0.58
$
2.29
gpt-oss-120b
131K
$
0.09
$
0.45
Model Name
gpt-oss-120b
131K
Context Length
Input (/M Tokens)
$
0.45
$
0.45
Output (/M Tokens)
gpt-oss-120b
131K
$
0.09
$
0.45
gpt-oss-20b
131K
$
0.04
$
0.18
Model Name
gpt-oss-20b
131K
Context Length
Input (/M Tokens)
$
0.18
$
0.18
Output (/M Tokens)
gpt-oss-20b
131K
$
0.04
$
0.18
Qwen2.5-14B-Instruct
33K
$
0.1
$
0.1
Model Name
Qwen2.5-14B-Instruct
33K
Context Length
Input (/M Tokens)
$
0.1
$
0.1
Output (/M Tokens)
Qwen2.5-14B-Instruct
33K
$
0.1
$
0.1
Qwen2.5-32B-Instruct
33K
$
0.18
$
0.18
Model Name
Qwen2.5-32B-Instruct
33K
Context Length
Input (/M Tokens)
$
0.18
$
0.18
Output (/M Tokens)
Qwen2.5-32B-Instruct
33K
$
0.18
$
0.18
Qwen2.5-72B-Instruct
33K
$
0.59
$
0.59
Model Name
Qwen2.5-72B-Instruct
33K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
Qwen2.5-72B-Instruct
33K
$
0.59
$
0.59
Qwen2.5-72B-Instruct-128K
131K
$
0.59
$
0.59
Model Name
Qwen2.5-72B-Instruct-128K
131K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
Qwen2.5-72B-Instruct-128K
131K
$
0.59
$
0.59
Qwen2.5-7B-Instruct
33K
$
0.05
$
0.05
Model Name
Qwen2.5-7B-Instruct
33K
Context Length
Input (/M Tokens)
$
0.05
$
0.05
Output (/M Tokens)
Qwen2.5-7B-Instruct
33K
$
0.05
$
0.05
Qwen2.5-Coder-32B-Instruct
33K
$
0.18
$
0.18
Model Name
Qwen2.5-Coder-32B-Instruct
33K
Context Length
Input (/M Tokens)
$
0.18
$
0.18
Output (/M Tokens)
Qwen2.5-Coder-32B-Instruct
33K
$
0.18
$
0.18
Qwen2.5-VL-32B-Instruct
131K
$
0.27
$
0.27
Model Name
Qwen2.5-VL-32B-Instruct
131K
Context Length
Input (/M Tokens)
$
0.27
$
0.27
Output (/M Tokens)
Qwen2.5-VL-32B-Instruct
131K
$
0.27
$
0.27
Qwen2.5-VL-72B-Instruct
131K
$
0.59
$
0.59
Model Name
Qwen2.5-VL-72B-Instruct
131K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
Qwen2.5-VL-72B-Instruct
131K
$
0.59
$
0.59
Qwen2.5-VL-7B-Instruct
33K
$
0.05
$
0.05
Model Name
Qwen2.5-VL-7B-Instruct
33K
Context Length
Input (/M Tokens)
$
0.05
$
0.05
Output (/M Tokens)
Qwen2.5-VL-7B-Instruct
33K
$
0.05
$
0.05
Qwen3-14B
131K
$
0.07
$
0.28
Model Name
Qwen3-14B
131K
Context Length
Input (/M Tokens)
$
0.28
$
0.28
Output (/M Tokens)
Qwen3-14B
131K
$
0.07
$
0.28
Qwen3-235B-A22B
131K
$
0.35
$
1.42
Model Name
Qwen3-235B-A22B
131K
Context Length
Input (/M Tokens)
$
1.42
$
1.42
Output (/M Tokens)
Qwen3-235B-A22B
131K
$
0.35
$
1.42
Qwen3-235B-A22B-2507
262K
$
0.35
$
1.42
Model Name
Qwen3-235B-A22B-2507
262K
Context Length
Input (/M Tokens)
$
1.42
$
1.42
Output (/M Tokens)
Qwen3-235B-A22B-2507
262K
$
0.35
$
1.42
Qwen3-235B-A22B-Thinking-2507
262K
$
0.35
$
1.42
Model Name
Qwen3-235B-A22B-Thinking-2507
262K
Context Length
Input (/M Tokens)
$
1.42
$
1.42
Output (/M Tokens)
Qwen3-235B-A22B-Thinking-2507
262K
$
0.35
$
1.42
Qwen3-30B-A3B
131K
$
0.09
$
0.45
Model Name
Qwen3-30B-A3B
131K
Context Length
Input (/M Tokens)
$
0.45
$
0.45
Output (/M Tokens)
Qwen3-30B-A3B
131K
$
0.09
$
0.45
Qwen3-30B-A3B-Instruct-2507
262K
$
0.1
$
0.4
Model Name
Qwen3-30B-A3B-Instruct-2507
262K
Context Length
Input (/M Tokens)
$
0.4
$
0.4
Output (/M Tokens)
Qwen3-30B-A3B-Instruct-2507
262K
$
0.1
$
0.4
Qwen3-30B-A3B-Thinking-2507
262K
$
0.1
$
0.4
Model Name
Qwen3-30B-A3B-Thinking-2507
262K
Context Length
Input (/M Tokens)
$
0.4
$
0.4
Output (/M Tokens)
Qwen3-30B-A3B-Thinking-2507
262K
$
0.1
$
0.4
Qwen3-32B
131K
$
0.14
$
0.57
Model Name
Qwen3-32B
131K
Context Length
Input (/M Tokens)
$
0.57
$
0.57
Output (/M Tokens)
Qwen3-32B
131K
$
0.14
$
0.57
Qwen3-8B
131K
$
0.06
$
0.06
Model Name
Qwen3-8B
131K
Context Length
Input (/M Tokens)
$
0.06
$
0.06
Output (/M Tokens)
Qwen3-8B
131K
$
0.06
$
0.06
Qwen3-Coder-30B-A3B-Instruct
262K
$
0.1
$
0.4
Model Name
Qwen3-Coder-30B-A3B-Instruct
262K
Context Length
Input (/M Tokens)
$
0.4
$
0.4
Output (/M Tokens)
Qwen3-Coder-30B-A3B-Instruct
262K
$
0.1
$
0.4
Qwen3-Coder-480B-A35B
262K
$
1.14
$
2.28
Model Name
Qwen3-Coder-480B-A35B
262K
Context Length
Input (/M Tokens)
$
2.28
$
2.28
Output (/M Tokens)
Qwen3-Coder-480B-A35B
262K
$
1.14
$
2.28
Qwen3-Embedding-0.6B
33K
$
0.01
$
0
Model Name
Qwen3-Embedding-0.6B
33K
Context Length
Input (/M Tokens)
$
0.01
$
0.01
Output (/M Tokens)
Qwen3-Embedding-0.6B
33K
$
0.01
$
0
Qwen3-Embedding-4B
33K
$
0.02
$
0
Model Name
Qwen3-Embedding-4B
33K
Context Length
Input (/M Tokens)
$
0.02
$
0.02
Output (/M Tokens)
Qwen3-Embedding-4B
33K
$
0.02
$
0
Qwen3-Embedding-8B
33K
$
0.04
$
0
Model Name
Qwen3-Embedding-8B
33K
Context Length
Input (/M Tokens)
$
0.04
$
0.04
Output (/M Tokens)
Qwen3-Embedding-8B
33K
$
0.04
$
0
Qwen3-Reranker-0.6B
33K
$
0.01
$
0
Model Name
Qwen3-Reranker-0.6B
33K
Context Length
Input (/M Tokens)
$
0.01
$
0.01
Output (/M Tokens)
Qwen3-Reranker-0.6B
33K
$
0.01
$
0
Qwen3-Reranker-4B
33K
$
0.02
$
0
Model Name
Qwen3-Reranker-4B
33K
Context Length
Input (/M Tokens)
$
0.02
$
0.02
Output (/M Tokens)
Qwen3-Reranker-4B
33K
$
0.02
$
0
Qwen3-Reranker-8B
33K
$
0.04
$
0
Model Name
Qwen3-Reranker-8B
33K
Context Length
Input (/M Tokens)
$
0.04
$
0.04
Output (/M Tokens)
Qwen3-Reranker-8B
33K
$
0.04
$
0
QwQ-32B
131K
$
0.15
$
0.58
Model Name
QwQ-32B
131K
Context Length
Input (/M Tokens)
$
0.58
$
0.58
Output (/M Tokens)
QwQ-32B
131K
$
0.15
$
0.58
step3
66K
$
0.57
$
1.42
Model Name
step3
66K
Context Length
Input (/M Tokens)
$
1.42
$
1.42
Output (/M Tokens)
step3
66K
$
0.57
$
1.42
Hunyuan-A13B-Instruct
131K
$
0.14
$
0.57
Model Name
Hunyuan-A13B-Instruct
131K
Context Length
Input (/M Tokens)
$
0.57
$
0.57
Output (/M Tokens)
Hunyuan-A13B-Instruct
131K
$
0.14
$
0.57
GLM-4-32B-0414
33K
$
0.27
$
0.27
Model Name
GLM-4-32B-0414
33K
Context Length
Input (/M Tokens)
$
0.27
$
0.27
Output (/M Tokens)
GLM-4-32B-0414
33K
$
0.27
$
0.27
GLM-4-9B-0414
33K
$
0.086
$
0.086
Model Name
GLM-4-9B-0414
33K
Context Length
Input (/M Tokens)
$
0.086
$
0.086
Output (/M Tokens)
GLM-4-9B-0414
33K
$
0.086
$
0.086
GLM-4.1V-9B-Thinking
66K
$
0.035
$
0.14
Model Name
GLM-4.1V-9B-Thinking
66K
Context Length
Input (/M Tokens)
$
0.14
$
0.14
Output (/M Tokens)
GLM-4.1V-9B-Thinking
66K
$
0.035
$
0.14
GLM-Z1-32B-0414
131K
$
0.14
$
0.57
Model Name
GLM-Z1-32B-0414
131K
Context Length
Input (/M Tokens)
$
0.57
$
0.57
Output (/M Tokens)
GLM-Z1-32B-0414
131K
$
0.14
$
0.57
GLM-Z1-9B-0414
131K
$
0.086
$
0.086
Model Name
GLM-Z1-9B-0414
131K
Context Length
Input (/M Tokens)
$
0.086
$
0.086
Output (/M Tokens)
GLM-Z1-9B-0414
131K
$
0.086
$
0.086
GLM-4.5
131K
$
0.5
$
2
Model Name
GLM-4.5
131K
Context Length
Input (/M Tokens)
$
2
$
2
Output (/M Tokens)
GLM-4.5
131K
$
0.5
$
2
GLM-4.5-Air
131K
$
0.14
$
0.86
Model Name
GLM-4.5-Air
131K
Context Length
Input (/M Tokens)
$
0.86
$
0.86
Output (/M Tokens)
GLM-4.5-Air
131K
$
0.14
$
0.86
GLM-4.5V
66K
$
0.14
$
0.86
Model Name
GLM-4.5V
66K
Context Length
Input (/M Tokens)
$
0.86
$
0.86
Output (/M Tokens)
GLM-4.5V
66K
$
0.14
$
0.86
Prices shown are per 1 million tokens.
Prices shown are per 1 million tokens.
Prices shown are per 1 million tokens.
Audio Models
Process and generate audio with our high-quality speech recognition and synthesis models.
Model Name
Output (/M UTF-8 bytes)
Fish-Speech-1.5
$
15
Model Name
Fish-Speech-1.5
$
15
Price (
)
/ M UTF-8 bytes
Fish-Speech-1.5
$
15
FunAudioLLM/CosyVoice2-0.5B
$
7.15
Model Name
FunAudioLLM/CosyVoice2-0.5B
$
7.15
Price (
)
/ M UTF-8 bytes
FunAudioLLM/CosyVoice2-0.5B
$
7.15
IndexTTS-2
$
7.15
Model Name
IndexTTS-2
$
7.15
Price (
)
/ M UTF-8 bytes
IndexTTS-2
$
7.15
Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.
Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.
Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Frequently asked questions
How does billing work?
You're billed based on your usage. For chat models, you're charged per token for both input and output. For image, video, and audio models, pricing varies based on the specific task and output quality.
Are there any minimum commitments?
No, there are no minimum commitments. You only pay for what you use, and you can start with $1 in free credits.
Can I set spending limits?
Yes, you can set monthly spending limits in your account dashboard to control costs and prevent unexpected charges.
Do you offer volume discounts?
Yes, we offer volume discounts for high-usage customers. If your usage is substantial, please contact our sales team who can create a custom pricing plan tailored to your needs.
How do I get started?
Sign up for an account, get your API key, and start using our models right away. We provide comprehensive documentation and code examples to help you integrate quickly.
Ready to accelerate your AI development?
Ready to accelerate your AI development?


