Transparent Pricing

High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Transparent Pricing

High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Begin exploring our models and APIs with no commitment required.

Start for free

Begin exploring our models and APIs with no commitment required.

Start for free

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

LLM

High-performance language models for conversational AI applications, with competitive per-token pricing.

Model Name

Context Length

Context Length

Input (/M Tokens)

Output (/M Tokens)

ERNIE-4.5-300B-A47B

128K

$

0.29

$

1.15

Model Name

ERNIE-4.5-300B-A47B

128K

Context Length

Input (/M Tokens)

$

1.15

$

1.15

Output (/M Tokens)

ERNIE-4.5-300B-A47B

128K

$

0.29

$

1.15

DeepSeek-R1

160K

$

0.58

$

2.29

Model Name

DeepSeek-R1

160K

Context Length

Input (/M Tokens)

$

2.29

$

2.29

Output (/M Tokens)

DeepSeek-R1

160K

$

0.58

$

2.29

DeepSeek-R1-Distill-Llama-70B

32K

$

0.59

$

0.59

Model Name

DeepSeek-R1-Distill-Llama-70B

32K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

DeepSeek-R1-Distill-Llama-70B

32K

$

0.59

$

0.59

DeepSeek-R1-Distill-Llama-8B

32K

$

0.06

$

0.06

Model Name

DeepSeek-R1-Distill-Llama-8B

32K

Context Length

Input (/M Tokens)

$

0.06

$

0.06

Output (/M Tokens)

DeepSeek-R1-Distill-Llama-8B

32K

$

0.06

$

0.06

DeepSeek-R1-Distill-Qwen-14B

32K

$

0.1

$

0.1

Model Name

DeepSeek-R1-Distill-Qwen-14B

32K

Context Length

Input (/M Tokens)

$

0.1

$

0.1

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-14B

32K

$

0.1

$

0.1

DeepSeek-R1-Distill-Qwen-32B

32K

$

0.18

$

0.18

Model Name

DeepSeek-R1-Distill-Qwen-32B

32K

Context Length

Input (/M Tokens)

$

0.18

$

0.18

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-32B

32K

$

0.18

$

0.18

DeepSeek-R1-Distill-Qwen-7B

32K

$

0.05

$

0.05

Model Name

DeepSeek-R1-Distill-Qwen-7B

32K

Context Length

Input (/M Tokens)

$

0.05

$

0.05

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-7B

32K

$

0.05

$

0.05

DeepSeek-V3

128K

$

0.29

$

1.15

Model Name

DeepSeek-V3

128K

Context Length

Input (/M Tokens)

$

1.15

$

1.15

Output (/M Tokens)

DeepSeek-V3

128K

$

0.29

$

1.15

DeepSeek-VL2

4K

$

0.15

$

0.15

Model Name

DeepSeek-VL2

4K

Context Length

Input (/M Tokens)

$

0.15

$

0.15

Output (/M Tokens)

DeepSeek-VL2

4K

$

0.15

$

0.15

Llama-3.3-70B-Instruct

32K

$

0.59

$

0.59

Model Name

Llama-3.3-70B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

Llama-3.3-70B-Instruct

32K

$

0.59

$

0.59

Meta-Llama-3.1-8B-Instruct

32K

$

0.06

$

0.06

Model Name

Meta-Llama-3.1-8B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.06

$

0.06

Output (/M Tokens)

Meta-Llama-3.1-8B-Instruct

32K

$

0.06

$

0.06

MiniMax-M1-80k

128K

$

0.58

$

2.29

Model Name

MiniMax-M1-80k

128K

Context Length

Input (/M Tokens)

$

2.29

$

2.29

Output (/M Tokens)

MiniMax-M1-80k

128K

$

0.58

$

2.29

Kimi-K2-Instruct

128K

$

0.58

$

2.29

Model Name

Kimi-K2-Instruct

128K

Context Length

Input (/M Tokens)

$

2.29

$

2.29

Output (/M Tokens)

Kimi-K2-Instruct

128K

$

0.58

$

2.29

Qwen2.5-14B-Instruct

32K

$

0.1

$

0.1

Model Name

Qwen2.5-14B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.1

$

0.1

Output (/M Tokens)

Qwen2.5-14B-Instruct

32K

$

0.1

$

0.1

Qwen2.5-32B-Instruct

32K

$

0.18

$

0.18

Model Name

Qwen2.5-32B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.18

$

0.18

Output (/M Tokens)

Qwen2.5-32B-Instruct

32K

$

0.18

$

0.18

Qwen2.5-72B-Instruct

32K

$

0.59

$

0.59

Model Name

Qwen2.5-72B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

Qwen2.5-72B-Instruct

32K

$

0.59

$

0.59

Qwen2.5-72B-Instruct-128K

128K

$

0.59

$

0.59

Model Name

Qwen2.5-72B-Instruct-128K

128K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

Qwen2.5-72B-Instruct-128K

128K

$

0.59

$

0.59

Qwen2.5-7B-Instruct

32K

$

0.05

$

0.05

Model Name

Qwen2.5-7B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.05

$

0.05

Output (/M Tokens)

Qwen2.5-7B-Instruct

32K

$

0.05

$

0.05

Qwen2.5-Coder-32B-Instruct

32K

$

0.18

$

0.18

Model Name

Qwen2.5-Coder-32B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.18

$

0.18

Output (/M Tokens)

Qwen2.5-Coder-32B-Instruct

32K

$

0.18

$

0.18

Qwen2.5-VL-32B-Instruct

128K

$

0.27

$

0.27

Model Name

Qwen2.5-VL-32B-Instruct

128K

Context Length

Input (/M Tokens)

$

0.27

$

0.27

Output (/M Tokens)

Qwen2.5-VL-32B-Instruct

128K

$

0.27

$

0.27

Qwen2.5-VL-72B-Instruct

128K

$

0.59

$

0.59

Model Name

Qwen2.5-VL-72B-Instruct

128K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

Qwen2.5-VL-72B-Instruct

128K

$

0.59

$

0.59

Qwen2.5-VL-7B-Instruct

32K

$

0.05

$

0.05

Model Name

Qwen2.5-VL-7B-Instruct

32K

Context Length

Input (/M Tokens)

$

0.05

$

0.05

Output (/M Tokens)

Qwen2.5-VL-7B-Instruct

32K

$

0.05

$

0.05

Qwen3-14B

128K

$

0.07

$

0.28

Model Name

Qwen3-14B

128K

Context Length

Input (/M Tokens)

$

0.28

$

0.28

Output (/M Tokens)

Qwen3-14B

128K

$

0.07

$

0.28

Qwen3-235B-A22B

128K

$

0.35

$

1.42

Model Name

Qwen3-235B-A22B

128K

Context Length

Input (/M Tokens)

$

1.42

$

1.42

Output (/M Tokens)

Qwen3-235B-A22B

128K

$

0.35

$

1.42

Qwen3-235B-A22B-2507

256K

$

0.35

$

1.42

Model Name

Qwen3-235B-A22B-2507

256K

Context Length

Input (/M Tokens)

$

1.42

$

1.42

Output (/M Tokens)

Qwen3-235B-A22B-2507

256K

$

0.35

$

1.42

Qwen3-235B-A22B-Thinking-2507

256K

$

0.35

$

1.42

Model Name

Qwen3-235B-A22B-Thinking-2507

256K

Context Length

Input (/M Tokens)

$

1.42

$

1.42

Output (/M Tokens)

Qwen3-235B-A22B-Thinking-2507

256K

$

0.35

$

1.42

Qwen3-30B-A3B

128K

$

0.1

$

0.4

Model Name

Qwen3-30B-A3B

128K

Context Length

Input (/M Tokens)

$

0.4

$

0.4

Output (/M Tokens)

Qwen3-30B-A3B

128K

$

0.1

$

0.4

Qwen3-32B

128K

$

0.14

$

0.57

Model Name

Qwen3-32B

128K

Context Length

Input (/M Tokens)

$

0.57

$

0.57

Output (/M Tokens)

Qwen3-32B

128K

$

0.14

$

0.57

Qwen3-8B

128K

$

0.06

$

0.06

Model Name

Qwen3-8B

128K

Context Length

Input (/M Tokens)

$

0.06

$

0.06

Output (/M Tokens)

Qwen3-8B

128K

$

0.06

$

0.06

Qwen3-Embedding-0.6B

32K

$

0.01

$

0

Model Name

Qwen3-Embedding-0.6B

32K

Context Length

Input (/M Tokens)

$

0.01

$

0.01

Output (/M Tokens)

Qwen3-Embedding-0.6B

32K

$

0.01

$

0

Qwen3-Embedding-4B

32K

$

0.02

$

0

Model Name

Qwen3-Embedding-4B

32K

Context Length

Input (/M Tokens)

$

0.02

$

0.02

Output (/M Tokens)

Qwen3-Embedding-4B

32K

$

0.02

$

0

Qwen3-Embedding-8B

32K

$

0.04

$

0

Model Name

Qwen3-Embedding-8B

32K

Context Length

Input (/M Tokens)

$

0.04

$

0.04

Output (/M Tokens)

Qwen3-Embedding-8B

32K

$

0.04

$

0

Qwen3-Reranker-0.6B

32K

$

0.01

$

0

Model Name

Qwen3-Reranker-0.6B

32K

Context Length

Input (/M Tokens)

$

0.01

$

0.01

Output (/M Tokens)

Qwen3-Reranker-0.6B

32K

$

0.01

$

0

Qwen3-Reranker-4B

32K

$

0.02

$

0

Model Name

Qwen3-Reranker-4B

32K

Context Length

Input (/M Tokens)

$

0.02

$

0.02

Output (/M Tokens)

Qwen3-Reranker-4B

32K

$

0.02

$

0

Qwen3-Reranker-8B

32K

$

0.04

$

0

Model Name

Qwen3-Reranker-8B

32K

Context Length

Input (/M Tokens)

$

0.04

$

0.04

Output (/M Tokens)

Qwen3-Reranker-8B

32K

$

0.04

$

0

QwQ-32B

32K

$

0.15

$

0.58

Model Name

QwQ-32B

32K

Context Length

Input (/M Tokens)

$

0.58

$

0.58

Output (/M Tokens)

QwQ-32B

32K

$

0.15

$

0.58

Hunyuan-A13B-Instruct

128K

$

0.14

$

0.57

Model Name

Hunyuan-A13B-Instruct

128K

Context Length

Input (/M Tokens)

$

0.57

$

0.57

Output (/M Tokens)

Hunyuan-A13B-Instruct

128K

$

0.14

$

0.57

GLM-4-32B-0414

32K

$

0.27

$

0.27

Model Name

GLM-4-32B-0414

32K

Context Length

Input (/M Tokens)

$

0.27

$

0.27

Output (/M Tokens)

GLM-4-32B-0414

32K

$

0.27

$

0.27

GLM-4-9B-0414

32K

$

0.086

$

0.086

Model Name

GLM-4-9B-0414

32K

Context Length

Input (/M Tokens)

$

0.086

$

0.086

Output (/M Tokens)

GLM-4-9B-0414

32K

$

0.086

$

0.086

GLM-4.1V-9B-Thinking

64K

$

0.035

$

0.14

Model Name

GLM-4.1V-9B-Thinking

64K

Context Length

Input (/M Tokens)

$

0.14

$

0.14

Output (/M Tokens)

GLM-4.1V-9B-Thinking

64K

$

0.035

$

0.14

GLM-Z1-32B-0414

32K

$

0.14

$

0.57

Model Name

GLM-Z1-32B-0414

32K

Context Length

Input (/M Tokens)

$

0.57

$

0.57

Output (/M Tokens)

GLM-Z1-32B-0414

32K

$

0.14

$

0.57

GLM-Z1-9B-0414

32K

$

0.086

$

0.086

Model Name

GLM-Z1-9B-0414

32K

Context Length

Input (/M Tokens)

$

0.086

$

0.086

Output (/M Tokens)

GLM-Z1-9B-0414

32K

$

0.086

$

0.086

GLM-4.5

128K

$

0.5

$

2

Model Name

GLM-4.5

128K

Context Length

Input (/M Tokens)

$

2

$

2

Output (/M Tokens)

GLM-4.5

128K

$

0.5

$

2

GLM-4.5-Air

128K

$

0.14

$

0.86

Model Name

GLM-4.5-Air

128K

Context Length

Input (/M Tokens)

$

0.86

$

0.86

Output (/M Tokens)

GLM-4.5-Air

128K

$

0.14

$

0.86

Prices shown are per 1 million tokens.

Prices shown are per 1 million tokens.

Prices shown are per 1 million tokens.

Audio Models

Process and generate audio with our high-quality speech recognition and synthesis models.

Model Name

Output (/M UTF-8 bytes)

Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Frequently asked questions

How does billing work?

You're billed based on your usage. For chat models, you're charged per token for both input and output. For image, video, and audio models, pricing varies based on the specific task and output quality.

Are there any minimum commitments?

No, there are no minimum commitments. You only pay for what you use, and you can start with $1 in free credits.

Can I set spending limits?

Yes, you can set monthly spending limits in your account dashboard to control costs and prevent unexpected charges.

Do you offer volume discounts?

Yes, we offer volume discounts for high-usage customers. If your usage is substantial, please contact our sales team who can create a custom pricing plan tailored to your needs.

How do I get started?

Sign up for an account, get your API key, and start using our models right away. We provide comprehensive documentation and code examples to help you integrate quickly.

Ready to accelerate your AI development?

Ready to accelerate your AI development?

© 2025 SiliconFlow Technology PTE. LTD.

© 2025 SiliconFlow Technology PTE. LTD.

© 2025 SiliconFlow Technology PTE. LTD.