Transparent Pricing

High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Transparent Pricing

High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Begin exploring our models and APIs with no commitment required.

Start for free

Begin exploring our models and APIs with no commitment required.

Start for free

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

LLM

High-performance language models for conversational AI applications, with competitive per-token pricing.

Model Name

Context Length

Context Length

Input (/M Tokens)

Output (/M Tokens)

ERNIE-4.5-300B-A47B

131K

$

0.28

$

1.1

Model Name

ERNIE-4.5-300B-A47B

131K

Context Length

Input (/M Tokens)

$

1.1

$

1.1

Output (/M Tokens)

ERNIE-4.5-300B-A47B

131K

$

0.28

$

1.1

Seed-OSS-36B-Instruct

262K

$

0.21

$

0.57

Model Name

Seed-OSS-36B-Instruct

262K

Context Length

Input (/M Tokens)

$

0.57

$

0.57

Output (/M Tokens)

Seed-OSS-36B-Instruct

262K

$

0.21

$

0.57

DeepSeek-R1

164K

$

0.5

$

2.18

Model Name

DeepSeek-R1

164K

Context Length

Input (/M Tokens)

$

2.18

$

2.18

Output (/M Tokens)

DeepSeek-R1

164K

$

0.5

$

2.18

DeepSeek-R1-Distill-Qwen-14B

131K

$

0.1

$

0.1

Model Name

DeepSeek-R1-Distill-Qwen-14B

131K

Context Length

Input (/M Tokens)

$

0.1

$

0.1

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-14B

131K

$

0.1

$

0.1

DeepSeek-R1-Distill-Qwen-32B

131K

$

0.18

$

0.18

Model Name

DeepSeek-R1-Distill-Qwen-32B

131K

Context Length

Input (/M Tokens)

$

0.18

$

0.18

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-32B

131K

$

0.18

$

0.18

DeepSeek-R1-Distill-Qwen-7B

33K

$

0.05

$

0.05

Model Name

DeepSeek-R1-Distill-Qwen-7B

33K

Context Length

Input (/M Tokens)

$

0.05

$

0.05

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-7B

33K

$

0.05

$

0.05

DeepSeek-V3

164K

$

0.27

$

1.13

Model Name

DeepSeek-V3

164K

Context Length

Input (/M Tokens)

$

1.13

$

1.13

Output (/M Tokens)

DeepSeek-V3

164K

$

0.27

$

1.13

DeepSeek-V3.1

164K

$

0.27

$

1.1

Model Name

DeepSeek-V3.1

164K

Context Length

Input (/M Tokens)

$

1.1

$

1.1

Output (/M Tokens)

DeepSeek-V3.1

164K

$

0.27

$

1.1

DeepSeek-VL2

4K

$

0.15

$

0.15

Model Name

DeepSeek-VL2

4K

Context Length

Input (/M Tokens)

$

0.15

$

0.15

Output (/M Tokens)

DeepSeek-VL2

4K

$

0.15

$

0.15

Ling-mini-2.0

131K

$

0.07

$

0.29

Model Name

Ling-mini-2.0

131K

Context Length

Input (/M Tokens)

$

0.29

$

0.29

Output (/M Tokens)

Ling-mini-2.0

131K

$

0.07

$

0.29

Meta-Llama-3.1-8B-Instruct

33K

$

0.06

$

0.06

Model Name

Meta-Llama-3.1-8B-Instruct

33K

Context Length

Input (/M Tokens)

$

0.06

$

0.06

Output (/M Tokens)

Meta-Llama-3.1-8B-Instruct

33K

$

0.06

$

0.06

MiniMax-M1-80k

131K

$

0.55

$

2.2

Model Name

MiniMax-M1-80k

131K

Context Length

Input (/M Tokens)

$

2.2

$

2.2

Output (/M Tokens)

MiniMax-M1-80k

131K

$

0.55

$

2.2

Kimi-Dev-72B

131K

$

0.29

$

1.15

Model Name

Kimi-Dev-72B

131K

Context Length

Input (/M Tokens)

$

1.15

$

1.15

Output (/M Tokens)

Kimi-Dev-72B

131K

$

0.29

$

1.15

Kimi-K2-Instruct

131K

$

0.58

$

2.29

Model Name

Kimi-K2-Instruct

131K

Context Length

Input (/M Tokens)

$

2.29

$

2.29

Output (/M Tokens)

Kimi-K2-Instruct

131K

$

0.58

$

2.29

Kimi-K2-Instruct-0905

262K

$

0.58

$

2.29

Model Name

Kimi-K2-Instruct-0905

262K

Context Length

Input (/M Tokens)

$

2.29

$

2.29

Output (/M Tokens)

Kimi-K2-Instruct-0905

262K

$

0.58

$

2.29

gpt-oss-120b

131K

$

0.09

$

0.45

Model Name

gpt-oss-120b

131K

Context Length

Input (/M Tokens)

$

0.45

$

0.45

Output (/M Tokens)

gpt-oss-120b

131K

$

0.09

$

0.45

gpt-oss-20b

131K

$

0.04

$

0.18

Model Name

gpt-oss-20b

131K

Context Length

Input (/M Tokens)

$

0.18

$

0.18

Output (/M Tokens)

gpt-oss-20b

131K

$

0.04

$

0.18

Qwen2.5-14B-Instruct

33K

$

0.1

$

0.1

Model Name

Qwen2.5-14B-Instruct

33K

Context Length

Input (/M Tokens)

$

0.1

$

0.1

Output (/M Tokens)

Qwen2.5-14B-Instruct

33K

$

0.1

$

0.1

Qwen2.5-32B-Instruct

33K

$

0.18

$

0.18

Model Name

Qwen2.5-32B-Instruct

33K

Context Length

Input (/M Tokens)

$

0.18

$

0.18

Output (/M Tokens)

Qwen2.5-32B-Instruct

33K

$

0.18

$

0.18

Qwen2.5-72B-Instruct

33K

$

0.59

$

0.59

Model Name

Qwen2.5-72B-Instruct

33K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

Qwen2.5-72B-Instruct

33K

$

0.59

$

0.59

Qwen2.5-72B-Instruct-128K

131K

$

0.59

$

0.59

Model Name

Qwen2.5-72B-Instruct-128K

131K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

Qwen2.5-72B-Instruct-128K

131K

$

0.59

$

0.59

Qwen2.5-7B-Instruct

33K

$

0.05

$

0.05

Model Name

Qwen2.5-7B-Instruct

33K

Context Length

Input (/M Tokens)

$

0.05

$

0.05

Output (/M Tokens)

Qwen2.5-7B-Instruct

33K

$

0.05

$

0.05

Qwen2.5-Coder-32B-Instruct

33K

$

0.18

$

0.18

Model Name

Qwen2.5-Coder-32B-Instruct

33K

Context Length

Input (/M Tokens)

$

0.18

$

0.18

Output (/M Tokens)

Qwen2.5-Coder-32B-Instruct

33K

$

0.18

$

0.18

Qwen2.5-VL-32B-Instruct

131K

$

0.27

$

0.27

Model Name

Qwen2.5-VL-32B-Instruct

131K

Context Length

Input (/M Tokens)

$

0.27

$

0.27

Output (/M Tokens)

Qwen2.5-VL-32B-Instruct

131K

$

0.27

$

0.27

Qwen2.5-VL-72B-Instruct

131K

$

0.59

$

0.59

Model Name

Qwen2.5-VL-72B-Instruct

131K

Context Length

Input (/M Tokens)

$

0.59

$

0.59

Output (/M Tokens)

Qwen2.5-VL-72B-Instruct

131K

$

0.59

$

0.59

Qwen2.5-VL-7B-Instruct

33K

$

0.05

$

0.05

Model Name

Qwen2.5-VL-7B-Instruct

33K

Context Length

Input (/M Tokens)

$

0.05

$

0.05

Output (/M Tokens)

Qwen2.5-VL-7B-Instruct

33K

$

0.05

$

0.05

Qwen3-14B

131K

$

0.07

$

0.28

Model Name

Qwen3-14B

131K

Context Length

Input (/M Tokens)

$

0.28

$

0.28

Output (/M Tokens)

Qwen3-14B

131K

$

0.07

$

0.28

Qwen3-235B-A22B

131K

$

0.35

$

1.42

Model Name

Qwen3-235B-A22B

131K

Context Length

Input (/M Tokens)

$

1.42

$

1.42

Output (/M Tokens)

Qwen3-235B-A22B

131K

$

0.35

$

1.42

Qwen3-235B-A22B-2507

262K

$

0.35

$

1.42

Model Name

Qwen3-235B-A22B-2507

262K

Context Length

Input (/M Tokens)

$

1.42

$

1.42

Output (/M Tokens)

Qwen3-235B-A22B-2507

262K

$

0.35

$

1.42

Qwen3-235B-A22B-Thinking-2507

262K

$

0.35

$

1.42

Model Name

Qwen3-235B-A22B-Thinking-2507

262K

Context Length

Input (/M Tokens)

$

1.42

$

1.42

Output (/M Tokens)

Qwen3-235B-A22B-Thinking-2507

262K

$

0.35

$

1.42

Qwen3-30B-A3B

131K

$

0.09

$

0.45

Model Name

Qwen3-30B-A3B

131K

Context Length

Input (/M Tokens)

$

0.45

$

0.45

Output (/M Tokens)

Qwen3-30B-A3B

131K

$

0.09

$

0.45

Qwen3-30B-A3B-Instruct-2507

262K

$

0.1

$

0.4

Model Name

Qwen3-30B-A3B-Instruct-2507

262K

Context Length

Input (/M Tokens)

$

0.4

$

0.4

Output (/M Tokens)

Qwen3-30B-A3B-Instruct-2507

262K

$

0.1

$

0.4

Qwen3-30B-A3B-Thinking-2507

262K

$

0.1

$

0.4

Model Name

Qwen3-30B-A3B-Thinking-2507

262K

Context Length

Input (/M Tokens)

$

0.4

$

0.4

Output (/M Tokens)

Qwen3-30B-A3B-Thinking-2507

262K

$

0.1

$

0.4

Qwen3-32B

131K

$

0.14

$

0.57

Model Name

Qwen3-32B

131K

Context Length

Input (/M Tokens)

$

0.57

$

0.57

Output (/M Tokens)

Qwen3-32B

131K

$

0.14

$

0.57

Qwen3-8B

131K

$

0.06

$

0.06

Model Name

Qwen3-8B

131K

Context Length

Input (/M Tokens)

$

0.06

$

0.06

Output (/M Tokens)

Qwen3-8B

131K

$

0.06

$

0.06

Qwen3-Coder-30B-A3B-Instruct

262K

$

0.1

$

0.4

Model Name

Qwen3-Coder-30B-A3B-Instruct

262K

Context Length

Input (/M Tokens)

$

0.4

$

0.4

Output (/M Tokens)

Qwen3-Coder-30B-A3B-Instruct

262K

$

0.1

$

0.4

Qwen3-Coder-480B-A35B

262K

$

1.14

$

2.28

Model Name

Qwen3-Coder-480B-A35B

262K

Context Length

Input (/M Tokens)

$

2.28

$

2.28

Output (/M Tokens)

Qwen3-Coder-480B-A35B

262K

$

1.14

$

2.28

Qwen3-Embedding-0.6B

33K

$

0.01

$

0

Model Name

Qwen3-Embedding-0.6B

33K

Context Length

Input (/M Tokens)

$

0.01

$

0.01

Output (/M Tokens)

Qwen3-Embedding-0.6B

33K

$

0.01

$

0

Qwen3-Embedding-4B

33K

$

0.02

$

0

Model Name

Qwen3-Embedding-4B

33K

Context Length

Input (/M Tokens)

$

0.02

$

0.02

Output (/M Tokens)

Qwen3-Embedding-4B

33K

$

0.02

$

0

Qwen3-Embedding-8B

33K

$

0.04

$

0

Model Name

Qwen3-Embedding-8B

33K

Context Length

Input (/M Tokens)

$

0.04

$

0.04

Output (/M Tokens)

Qwen3-Embedding-8B

33K

$

0.04

$

0

Qwen3-Reranker-0.6B

33K

$

0.01

$

0

Model Name

Qwen3-Reranker-0.6B

33K

Context Length

Input (/M Tokens)

$

0.01

$

0.01

Output (/M Tokens)

Qwen3-Reranker-0.6B

33K

$

0.01

$

0

Qwen3-Reranker-4B

33K

$

0.02

$

0

Model Name

Qwen3-Reranker-4B

33K

Context Length

Input (/M Tokens)

$

0.02

$

0.02

Output (/M Tokens)

Qwen3-Reranker-4B

33K

$

0.02

$

0

Qwen3-Reranker-8B

33K

$

0.04

$

0

Model Name

Qwen3-Reranker-8B

33K

Context Length

Input (/M Tokens)

$

0.04

$

0.04

Output (/M Tokens)

Qwen3-Reranker-8B

33K

$

0.04

$

0

QwQ-32B

131K

$

0.15

$

0.58

Model Name

QwQ-32B

131K

Context Length

Input (/M Tokens)

$

0.58

$

0.58

Output (/M Tokens)

QwQ-32B

131K

$

0.15

$

0.58

step3

66K

$

0.57

$

1.42

Model Name

step3

66K

Context Length

Input (/M Tokens)

$

1.42

$

1.42

Output (/M Tokens)

step3

66K

$

0.57

$

1.42

Hunyuan-A13B-Instruct

131K

$

0.14

$

0.57

Model Name

Hunyuan-A13B-Instruct

131K

Context Length

Input (/M Tokens)

$

0.57

$

0.57

Output (/M Tokens)

Hunyuan-A13B-Instruct

131K

$

0.14

$

0.57

GLM-4-32B-0414

33K

$

0.27

$

0.27

Model Name

GLM-4-32B-0414

33K

Context Length

Input (/M Tokens)

$

0.27

$

0.27

Output (/M Tokens)

GLM-4-32B-0414

33K

$

0.27

$

0.27

GLM-4-9B-0414

33K

$

0.086

$

0.086

Model Name

GLM-4-9B-0414

33K

Context Length

Input (/M Tokens)

$

0.086

$

0.086

Output (/M Tokens)

GLM-4-9B-0414

33K

$

0.086

$

0.086

GLM-4.1V-9B-Thinking

66K

$

0.035

$

0.14

Model Name

GLM-4.1V-9B-Thinking

66K

Context Length

Input (/M Tokens)

$

0.14

$

0.14

Output (/M Tokens)

GLM-4.1V-9B-Thinking

66K

$

0.035

$

0.14

GLM-Z1-32B-0414

131K

$

0.14

$

0.57

Model Name

GLM-Z1-32B-0414

131K

Context Length

Input (/M Tokens)

$

0.57

$

0.57

Output (/M Tokens)

GLM-Z1-32B-0414

131K

$

0.14

$

0.57

GLM-Z1-9B-0414

131K

$

0.086

$

0.086

Model Name

GLM-Z1-9B-0414

131K

Context Length

Input (/M Tokens)

$

0.086

$

0.086

Output (/M Tokens)

GLM-Z1-9B-0414

131K

$

0.086

$

0.086

GLM-4.5

131K

$

0.5

$

2

Model Name

GLM-4.5

131K

Context Length

Input (/M Tokens)

$

2

$

2

Output (/M Tokens)

GLM-4.5

131K

$

0.5

$

2

GLM-4.5-Air

131K

$

0.14

$

0.86

Model Name

GLM-4.5-Air

131K

Context Length

Input (/M Tokens)

$

0.86

$

0.86

Output (/M Tokens)

GLM-4.5-Air

131K

$

0.14

$

0.86

GLM-4.5V

66K

$

0.14

$

0.86

Model Name

GLM-4.5V

66K

Context Length

Input (/M Tokens)

$

0.86

$

0.86

Output (/M Tokens)

GLM-4.5V

66K

$

0.14

$

0.86

Prices shown are per 1 million tokens.

Prices shown are per 1 million tokens.

Prices shown are per 1 million tokens.

Audio Models

Process and generate audio with our high-quality speech recognition and synthesis models.

Model Name

Output (/M UTF-8 bytes)

Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Frequently asked questions

How does billing work?

You're billed based on your usage. For chat models, you're charged per token for both input and output. For image, video, and audio models, pricing varies based on the specific task and output quality.

Are there any minimum commitments?

No, there are no minimum commitments. You only pay for what you use, and you can start with $1 in free credits.

Can I set spending limits?

Yes, you can set monthly spending limits in your account dashboard to control costs and prevent unexpected charges.

Do you offer volume discounts?

Yes, we offer volume discounts for high-usage customers. If your usage is substantial, please contact our sales team who can create a custom pricing plan tailored to your needs.

How do I get started?

Sign up for an account, get your API key, and start using our models right away. We provide comprehensive documentation and code examples to help you integrate quickly.

Ready to accelerate your AI development?

Ready to accelerate your AI development?

© 2025 SiliconFlow Technology PTE. LTD.

© 2025 SiliconFlow Technology PTE. LTD.

© 2025 SiliconFlow Technology PTE. LTD.