Transparent Pricing
High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.
Transparent Pricing
High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Begin exploring our models and APIs with no commitment required.
Start for free
Begin exploring our models and APIs with no commitment required.
Start for free
Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.
Simple integration
Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.
Simple integration
Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.
Simple integration
Only pay for what you use. Set spending limits and monitor usage through our dashboard.
Pay-as-you-go
Only pay for what you use. Set spending limits and monitor usage through our dashboard.
Pay-as-you-go
Only pay for what you use. Set spending limits and monitor usage through our dashboard.
Pay-as-you-go
Serverless Pricing
Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!
Serverless Pricing
Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!
Serverless Pricing
Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!
Image Generation
Generate high-quality images from text prompts with our state-of-the-art image generation models.
Model Name
Price (/image)
FLUX 1.1 [pro]
$
0.04
Model Name
FLUX 1.1 [pro]
$
0.04
Price (
)
/ Image
FLUX 1.1 [pro]
$
0.04
FLUX 1.1 [pro] Ultra
$
0.06
Model Name
FLUX 1.1 [pro] Ultra
$
0.06
Price (
)
/ Image
FLUX 1.1 [pro] Ultra
$
0.06
FLUX.1-dev
$
0.014
Model Name
FLUX.1-dev
$
0.014
Price (
)
/ Image
FLUX.1-dev
$
0.014
FLUX.1-Kontext-dev
$
0.015
Model Name
FLUX.1-Kontext-dev
$
0.015
Price (
)
/ Image
FLUX.1-Kontext-dev
$
0.015
FLUX.1 Kontext [max]
$
0.08
Model Name
FLUX.1 Kontext [max]
$
0.08
Price (
)
/ Image
FLUX.1 Kontext [max]
$
0.08
FLUX.1 Kontext [pro]
$
0.04
Model Name
FLUX.1 Kontext [pro]
$
0.04
Price (
)
/ Image
FLUX.1 Kontext [pro]
$
0.04
FLUX.1-schnell
$
0.0014
Model Name
FLUX.1-schnell
$
0.0014
Price (
)
/ Image
FLUX.1-schnell
$
0.0014
Prices shown are per image generated or edited.
Prices shown are per image generated or edited.
Prices shown are per image generated or edited.
Video Generation
Create dynamic videos from text descriptions with our cutting-edge video generation models.
Model Name
Price (/video)
Wan2.1-I2V-14B-720P
$
0.29
Model Name
Wan2.1-I2V-14B-720P
$
0.29
Price (
)
/ Video
Wan2.1-I2V-14B-720P
$
0.29
Wan2.1-I2V-14B-720P (Turbo)
$
0.21
Model Name
Wan2.1-I2V-14B-720P (Turbo)
$
0.21
Price (
)
/ Video
Wan2.1-I2V-14B-720P (Turbo)
$
0.21
Wan2.1-T2V-14B
$
0.29
Model Name
Wan2.1-T2V-14B
$
0.29
Price (
)
/ Video
Wan2.1-T2V-14B
$
0.29
Wan2.1-T2V-14B (Turbo)
$
0.21
Model Name
Wan2.1-T2V-14B (Turbo)
$
0.21
Price (
)
/ Video
Wan2.1-T2V-14B (Turbo)
$
0.21
Prices shown are per video generated.
Prices shown are per video generated.
Prices shown are per video generated.
LLM
High-performance language models for conversational AI applications, with competitive per-token pricing.
Model Name
Context Length
Context Length
Input (/M Tokens)
Output (/M Tokens)
ERNIE-4.5-300B-A47B
128K
$
0.29
$
1.15
Model Name
ERNIE-4.5-300B-A47B
128K
Context Length
Input (/M Tokens)
$
1.15
$
1.15
Output (/M Tokens)
ERNIE-4.5-300B-A47B
128K
$
0.29
$
1.15
DeepSeek-R1
160K
$
0.58
$
2.29
Model Name
DeepSeek-R1
160K
Context Length
Input (/M Tokens)
$
2.29
$
2.29
Output (/M Tokens)
DeepSeek-R1
160K
$
0.58
$
2.29
DeepSeek-R1-Distill-Llama-70B
32K
$
0.59
$
0.59
Model Name
DeepSeek-R1-Distill-Llama-70B
32K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
DeepSeek-R1-Distill-Llama-70B
32K
$
0.59
$
0.59
DeepSeek-R1-Distill-Llama-8B
32K
$
0.06
$
0.06
Model Name
DeepSeek-R1-Distill-Llama-8B
32K
Context Length
Input (/M Tokens)
$
0.06
$
0.06
Output (/M Tokens)
DeepSeek-R1-Distill-Llama-8B
32K
$
0.06
$
0.06
DeepSeek-R1-Distill-Qwen-14B
32K
$
0.1
$
0.1
Model Name
DeepSeek-R1-Distill-Qwen-14B
32K
Context Length
Input (/M Tokens)
$
0.1
$
0.1
Output (/M Tokens)
DeepSeek-R1-Distill-Qwen-14B
32K
$
0.1
$
0.1
DeepSeek-R1-Distill-Qwen-32B
32K
$
0.18
$
0.18
Model Name
DeepSeek-R1-Distill-Qwen-32B
32K
Context Length
Input (/M Tokens)
$
0.18
$
0.18
Output (/M Tokens)
DeepSeek-R1-Distill-Qwen-32B
32K
$
0.18
$
0.18
DeepSeek-R1-Distill-Qwen-7B
32K
$
0.05
$
0.05
Model Name
DeepSeek-R1-Distill-Qwen-7B
32K
Context Length
Input (/M Tokens)
$
0.05
$
0.05
Output (/M Tokens)
DeepSeek-R1-Distill-Qwen-7B
32K
$
0.05
$
0.05
DeepSeek-V3
128K
$
0.29
$
1.15
Model Name
DeepSeek-V3
128K
Context Length
Input (/M Tokens)
$
1.15
$
1.15
Output (/M Tokens)
DeepSeek-V3
128K
$
0.29
$
1.15
DeepSeek-VL2
4K
$
0.15
$
0.15
Model Name
DeepSeek-VL2
4K
Context Length
Input (/M Tokens)
$
0.15
$
0.15
Output (/M Tokens)
DeepSeek-VL2
4K
$
0.15
$
0.15
Llama-3.3-70B-Instruct
32K
$
0.59
$
0.59
Model Name
Llama-3.3-70B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
Llama-3.3-70B-Instruct
32K
$
0.59
$
0.59
Meta-Llama-3.1-8B-Instruct
32K
$
0.06
$
0.06
Model Name
Meta-Llama-3.1-8B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.06
$
0.06
Output (/M Tokens)
Meta-Llama-3.1-8B-Instruct
32K
$
0.06
$
0.06
MiniMax-M1-80k
128K
$
0.58
$
2.29
Model Name
MiniMax-M1-80k
128K
Context Length
Input (/M Tokens)
$
2.29
$
2.29
Output (/M Tokens)
MiniMax-M1-80k
128K
$
0.58
$
2.29
Kimi-K2-Instruct
128K
$
0.58
$
2.29
Model Name
Kimi-K2-Instruct
128K
Context Length
Input (/M Tokens)
$
2.29
$
2.29
Output (/M Tokens)
Kimi-K2-Instruct
128K
$
0.58
$
2.29
Qwen2.5-14B-Instruct
32K
$
0.1
$
0.1
Model Name
Qwen2.5-14B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.1
$
0.1
Output (/M Tokens)
Qwen2.5-14B-Instruct
32K
$
0.1
$
0.1
Qwen2.5-32B-Instruct
32K
$
0.18
$
0.18
Model Name
Qwen2.5-32B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.18
$
0.18
Output (/M Tokens)
Qwen2.5-32B-Instruct
32K
$
0.18
$
0.18
Qwen2.5-72B-Instruct
32K
$
0.59
$
0.59
Model Name
Qwen2.5-72B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
Qwen2.5-72B-Instruct
32K
$
0.59
$
0.59
Qwen2.5-72B-Instruct-128K
128K
$
0.59
$
0.59
Model Name
Qwen2.5-72B-Instruct-128K
128K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
Qwen2.5-72B-Instruct-128K
128K
$
0.59
$
0.59
Qwen2.5-7B-Instruct
32K
$
0.05
$
0.05
Model Name
Qwen2.5-7B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.05
$
0.05
Output (/M Tokens)
Qwen2.5-7B-Instruct
32K
$
0.05
$
0.05
Qwen2.5-Coder-32B-Instruct
32K
$
0.18
$
0.18
Model Name
Qwen2.5-Coder-32B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.18
$
0.18
Output (/M Tokens)
Qwen2.5-Coder-32B-Instruct
32K
$
0.18
$
0.18
Qwen2.5-VL-32B-Instruct
128K
$
0.27
$
0.27
Model Name
Qwen2.5-VL-32B-Instruct
128K
Context Length
Input (/M Tokens)
$
0.27
$
0.27
Output (/M Tokens)
Qwen2.5-VL-32B-Instruct
128K
$
0.27
$
0.27
Qwen2.5-VL-72B-Instruct
128K
$
0.59
$
0.59
Model Name
Qwen2.5-VL-72B-Instruct
128K
Context Length
Input (/M Tokens)
$
0.59
$
0.59
Output (/M Tokens)
Qwen2.5-VL-72B-Instruct
128K
$
0.59
$
0.59
Qwen2.5-VL-7B-Instruct
32K
$
0.05
$
0.05
Model Name
Qwen2.5-VL-7B-Instruct
32K
Context Length
Input (/M Tokens)
$
0.05
$
0.05
Output (/M Tokens)
Qwen2.5-VL-7B-Instruct
32K
$
0.05
$
0.05
Qwen3-14B
128K
$
0.07
$
0.28
Model Name
Qwen3-14B
128K
Context Length
Input (/M Tokens)
$
0.28
$
0.28
Output (/M Tokens)
Qwen3-14B
128K
$
0.07
$
0.28
Qwen3-235B-A22B
128K
$
0.35
$
1.42
Model Name
Qwen3-235B-A22B
128K
Context Length
Input (/M Tokens)
$
1.42
$
1.42
Output (/M Tokens)
Qwen3-235B-A22B
128K
$
0.35
$
1.42
Qwen3-235B-A22B-2507
256K
$
0.35
$
1.42
Model Name
Qwen3-235B-A22B-2507
256K
Context Length
Input (/M Tokens)
$
1.42
$
1.42
Output (/M Tokens)
Qwen3-235B-A22B-2507
256K
$
0.35
$
1.42
Qwen3-235B-A22B-Thinking-2507
256K
$
0.35
$
1.42
Model Name
Qwen3-235B-A22B-Thinking-2507
256K
Context Length
Input (/M Tokens)
$
1.42
$
1.42
Output (/M Tokens)
Qwen3-235B-A22B-Thinking-2507
256K
$
0.35
$
1.42
Qwen3-30B-A3B
128K
$
0.1
$
0.4
Model Name
Qwen3-30B-A3B
128K
Context Length
Input (/M Tokens)
$
0.4
$
0.4
Output (/M Tokens)
Qwen3-30B-A3B
128K
$
0.1
$
0.4
Qwen3-32B
128K
$
0.14
$
0.57
Model Name
Qwen3-32B
128K
Context Length
Input (/M Tokens)
$
0.57
$
0.57
Output (/M Tokens)
Qwen3-32B
128K
$
0.14
$
0.57
Qwen3-8B
128K
$
0.06
$
0.06
Model Name
Qwen3-8B
128K
Context Length
Input (/M Tokens)
$
0.06
$
0.06
Output (/M Tokens)
Qwen3-8B
128K
$
0.06
$
0.06
Qwen3-Embedding-0.6B
32K
$
0.01
$
0
Model Name
Qwen3-Embedding-0.6B
32K
Context Length
Input (/M Tokens)
$
0.01
$
0.01
Output (/M Tokens)
Qwen3-Embedding-0.6B
32K
$
0.01
$
0
Qwen3-Embedding-4B
32K
$
0.02
$
0
Model Name
Qwen3-Embedding-4B
32K
Context Length
Input (/M Tokens)
$
0.02
$
0.02
Output (/M Tokens)
Qwen3-Embedding-4B
32K
$
0.02
$
0
Qwen3-Embedding-8B
32K
$
0.04
$
0
Model Name
Qwen3-Embedding-8B
32K
Context Length
Input (/M Tokens)
$
0.04
$
0.04
Output (/M Tokens)
Qwen3-Embedding-8B
32K
$
0.04
$
0
Qwen3-Reranker-0.6B
32K
$
0.01
$
0
Model Name
Qwen3-Reranker-0.6B
32K
Context Length
Input (/M Tokens)
$
0.01
$
0.01
Output (/M Tokens)
Qwen3-Reranker-0.6B
32K
$
0.01
$
0
Qwen3-Reranker-4B
32K
$
0.02
$
0
Model Name
Qwen3-Reranker-4B
32K
Context Length
Input (/M Tokens)
$
0.02
$
0.02
Output (/M Tokens)
Qwen3-Reranker-4B
32K
$
0.02
$
0
Qwen3-Reranker-8B
32K
$
0.04
$
0
Model Name
Qwen3-Reranker-8B
32K
Context Length
Input (/M Tokens)
$
0.04
$
0.04
Output (/M Tokens)
Qwen3-Reranker-8B
32K
$
0.04
$
0
QwQ-32B
32K
$
0.15
$
0.58
Model Name
QwQ-32B
32K
Context Length
Input (/M Tokens)
$
0.58
$
0.58
Output (/M Tokens)
QwQ-32B
32K
$
0.15
$
0.58
Hunyuan-A13B-Instruct
128K
$
0.14
$
0.57
Model Name
Hunyuan-A13B-Instruct
128K
Context Length
Input (/M Tokens)
$
0.57
$
0.57
Output (/M Tokens)
Hunyuan-A13B-Instruct
128K
$
0.14
$
0.57
GLM-4-32B-0414
32K
$
0.27
$
0.27
Model Name
GLM-4-32B-0414
32K
Context Length
Input (/M Tokens)
$
0.27
$
0.27
Output (/M Tokens)
GLM-4-32B-0414
32K
$
0.27
$
0.27
GLM-4-9B-0414
32K
$
0.086
$
0.086
Model Name
GLM-4-9B-0414
32K
Context Length
Input (/M Tokens)
$
0.086
$
0.086
Output (/M Tokens)
GLM-4-9B-0414
32K
$
0.086
$
0.086
GLM-4.1V-9B-Thinking
64K
$
0.035
$
0.14
Model Name
GLM-4.1V-9B-Thinking
64K
Context Length
Input (/M Tokens)
$
0.14
$
0.14
Output (/M Tokens)
GLM-4.1V-9B-Thinking
64K
$
0.035
$
0.14
GLM-Z1-32B-0414
32K
$
0.14
$
0.57
Model Name
GLM-Z1-32B-0414
32K
Context Length
Input (/M Tokens)
$
0.57
$
0.57
Output (/M Tokens)
GLM-Z1-32B-0414
32K
$
0.14
$
0.57
GLM-Z1-9B-0414
32K
$
0.086
$
0.086
Model Name
GLM-Z1-9B-0414
32K
Context Length
Input (/M Tokens)
$
0.086
$
0.086
Output (/M Tokens)
GLM-Z1-9B-0414
32K
$
0.086
$
0.086
GLM-4.5
128K
$
0.5
$
2
Model Name
GLM-4.5
128K
Context Length
Input (/M Tokens)
$
2
$
2
Output (/M Tokens)
GLM-4.5
128K
$
0.5
$
2
GLM-4.5-Air
128K
$
0.14
$
0.86
Model Name
GLM-4.5-Air
128K
Context Length
Input (/M Tokens)
$
0.86
$
0.86
Output (/M Tokens)
GLM-4.5-Air
128K
$
0.14
$
0.86
Prices shown are per 1 million tokens.
Prices shown are per 1 million tokens.
Prices shown are per 1 million tokens.
Audio Models
Process and generate audio with our high-quality speech recognition and synthesis models.
Model Name
Output (/M UTF-8 bytes)
Fish-Speech-1.5
$
15
Model Name
Fish-Speech-1.5
$
15
Price (
)
/ M UTF-8 bytes
Fish-Speech-1.5
$
15
FunAudioLLM/CosyVoice2-0.5B
$
7.15
Model Name
FunAudioLLM/CosyVoice2-0.5B
$
7.15
Price (
)
/ M UTF-8 bytes
FunAudioLLM/CosyVoice2-0.5B
$
7.15
Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.
Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.
Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Frequently asked questions
How does billing work?
You're billed based on your usage. For chat models, you're charged per token for both input and output. For image, video, and audio models, pricing varies based on the specific task and output quality.
Are there any minimum commitments?
No, there are no minimum commitments. You only pay for what you use, and you can start with $1 in free credits.
Can I set spending limits?
Yes, you can set monthly spending limits in your account dashboard to control costs and prevent unexpected charges.
Do you offer volume discounts?
Yes, we offer volume discounts for high-usage customers. If your usage is substantial, please contact our sales team who can create a custom pricing plan tailored to your needs.
How do I get started?
Sign up for an account, get your API key, and start using our models right away. We provide comprehensive documentation and code examples to help you integrate quickly.
Ready to accelerate your AI development?
Ready to accelerate your AI development?


