Models

Products

Pricing

Docs

Blog

About

Contact

🎉 Ring-1T Now on SiliconFlow: The World's First Open-Source Trillion-Parameter Thinking Model

Transparent Pricing

High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Get Started

Contact Sales

Transparent Pricing

High-performance inference at competitive prices. Pay only for what you use with no hidden fees or commitments.

Get Started

Contact Sales

Begin exploring our models and APIs with no commitment required.

Start for free

Begin exploring our models and APIs with no commitment required.

Start for free

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Switch to SiliconFlow with a single line of code. Our APIs are compatible with OpenAI standards.

Simple integration

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Only pay for what you use. Set spending limits and monitor usage through our dashboard.

Pay-as-you-go

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

Serverless Pricing

Flexible token pricing, high usage limits, and postpaid billing—plus $1 in free credits to get you started!

Image Generation

Generate high-quality images from text prompts with our state-of-the-art image generation models.

Model Name

Price (/image)

FLUX 1.1 [pro]

0.04

Model Name

FLUX 1.1 [pro]

0.04

Price (

)

/ Image

FLUX 1.1 [pro]

0.04

FLUX 1.1 [pro] Ultra

0.06

Model Name

FLUX 1.1 [pro] Ultra

0.06

Price (

)

/ Image

FLUX 1.1 [pro] Ultra

0.06

FLUX.1 Kontext [max]

0.08

Model Name

FLUX.1 Kontext [max]

0.08

Price (

)

/ Image

FLUX.1 Kontext [max]

0.08

FLUX.1 Kontext [pro]

0.04

Model Name

FLUX.1 Kontext [pro]

0.04

Price (

)

/ Image

FLUX.1 Kontext [pro]

0.04

FLUX.1-dev

0.014

Model Name

FLUX.1-dev

0.014

Price (

)

/ Image

FLUX.1-dev

0.014

FLUX.1-Kontext-dev

0.015

Model Name

FLUX.1-Kontext-dev

0.015

Price (

)

/ Image

FLUX.1-Kontext-dev

0.015

FLUX.1-schnell

0.0014

Model Name

FLUX.1-schnell

0.0014

Price (

)

/ Image

FLUX.1-schnell

0.0014

Qwen-Image

0.042

Model Name

Qwen-Image

0.042

Price (

)

/ Image

Qwen-Image

0.042

Qwen-Image-Edit

0.04

Model Name

Qwen-Image-Edit

0.04

Price (

)

/ Image

Qwen-Image-Edit

0.04

Prices shown are per image generated or edited.

Video Generation

Create dynamic videos from text descriptions with our cutting-edge video generation models.

Model Name

Price (/video)

Wan2.2-I2V-A14B

0.29

Wan2.2-I2V-A14B

0.29

Price (

)

/ Video

Wan2.2-I2V-A14B

0.29

Wan2.2-T2V-A14B

0.29

Wan2.2-T2V-A14B

0.29

Price (

)

/ Video

Wan2.2-T2V-A14B

0.29

Prices shown are per video generated.

LLM

High-performance language models for conversational AI applications, with competitive per-token pricing.

Model Name

Context Length

Input (/M Tokens)

Output (/M Tokens)

DeepSeek-R1

164K

0.5

2.18

Model Name

DeepSeek-R1

164K

Context Length

Input (/M Tokens)

DeepSeek-R1

Output (/M Tokens)

DeepSeek-R1

164K

0.5

2.18

DeepSeek-R1-Distill-Qwen-14B

131K

0.1

Model Name

DeepSeek-R1-Distill-Qwen-14B

131K

Context Length

Input (/M Tokens)

DeepSeek-R1-Distill-Qwen-14B

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-14B

131K

0.1

DeepSeek-R1-Distill-Qwen-32B

131K

0.18

Model Name

DeepSeek-R1-Distill-Qwen-32B

131K

Context Length

Input (/M Tokens)

DeepSeek-R1-Distill-Qwen-32B

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-32B

131K

0.18

DeepSeek-R1-Distill-Qwen-7B

33K

0.05

Model Name

DeepSeek-R1-Distill-Qwen-7B

33K

Context Length

Input (/M Tokens)

DeepSeek-R1-Distill-Qwen-7B

Output (/M Tokens)

DeepSeek-R1-Distill-Qwen-7B

33K

0.05

DeepSeek-V3

164K

0.25

1.0

Model Name

DeepSeek-V3

164K

Context Length

Input (/M Tokens)

DeepSeek-V3

Output (/M Tokens)

DeepSeek-V3

164K

0.25

1.0

DeepSeek-V3.1

164K

0.27

1.0

Model Name

DeepSeek-V3.1

164K

Context Length

Input (/M Tokens)

DeepSeek-V3.1

Output (/M Tokens)

DeepSeek-V3.1

164K

0.27

1.0

DeepSeek-V3.1-Terminus

164K

0.27

1.0

Model Name

DeepSeek-V3.1-Terminus

164K

Context Length

Input (/M Tokens)

DeepSeek-V3.1-Terminus

Output (/M Tokens)

DeepSeek-V3.1-Terminus

164K

0.27

1.0

DeepSeek-V3.2-Exp

164K

0.27

0.41

Model Name

DeepSeek-V3.2-Exp

164K

Context Length

Input (/M Tokens)

DeepSeek-V3.2-Exp

Output (/M Tokens)

DeepSeek-V3.2-Exp

164K

0.27

0.41

DeepSeek-VL2

0.15

Model Name

DeepSeek-VL2

Context Length

Input (/M Tokens)

DeepSeek-VL2

Output (/M Tokens)

DeepSeek-VL2

0.15

ERNIE-4.5-300B-A47B

131K

0.28

1.1

Model Name

ERNIE-4.5-300B-A47B

131K

Context Length

Input (/M Tokens)

ERNIE-4.5-300B-A47B

Output (/M Tokens)

ERNIE-4.5-300B-A47B

131K

0.28

1.1

GLM-4-32B-0414

33K

0.27

Model Name

GLM-4-32B-0414

33K

Context Length

Input (/M Tokens)

GLM-4-32B-0414

Output (/M Tokens)

GLM-4-32B-0414

33K

0.27

GLM-4-9B-0414

33K

0.086

Model Name

GLM-4-9B-0414

33K

Context Length

Input (/M Tokens)

GLM-4-9B-0414

Output (/M Tokens)

GLM-4-9B-0414

33K

0.086

GLM-4.1V-9B-Thinking

66K

0.035

0.14

Model Name

GLM-4.1V-9B-Thinking

66K

Context Length

Input (/M Tokens)

GLM-4.1V-9B-Thinking

Output (/M Tokens)

GLM-4.1V-9B-Thinking

66K

0.035

0.14

GLM-4.5

131K

0.4

2.0

Model Name

GLM-4.5

131K

Context Length

Input (/M Tokens)

GLM-4.5

Output (/M Tokens)

GLM-4.5

131K

0.4

2.0

GLM-4.5-Air

131K

0.14

0.86

Model Name

GLM-4.5-Air

131K

Context Length

Input (/M Tokens)

GLM-4.5-Air

Output (/M Tokens)

GLM-4.5-Air

131K

0.14

0.86

GLM-4.5V

66K

0.14

0.86

Model Name

GLM-4.5V

66K

Context Length

Input (/M Tokens)

GLM-4.5V

Output (/M Tokens)

GLM-4.5V

66K

0.14

0.86

GLM-4.6

205K

0.5

1.9

Model Name

GLM-4.6

205K

Context Length

Input (/M Tokens)

GLM-4.6

Output (/M Tokens)

GLM-4.6

205K

0.5

1.9

GLM-Z1-32B-0414

131K

0.14

0.57

Model Name

GLM-Z1-32B-0414

131K

Context Length

Input (/M Tokens)

GLM-Z1-32B-0414

Output (/M Tokens)

GLM-Z1-32B-0414

131K

0.14

0.57

GLM-Z1-9B-0414

131K

0.086

Model Name

GLM-Z1-9B-0414

131K

Context Length

Input (/M Tokens)

GLM-Z1-9B-0414

Output (/M Tokens)

GLM-Z1-9B-0414

131K

0.086

gpt-oss-120b

131K

0.05

0.45

Model Name

gpt-oss-120b

131K

Context Length

Input (/M Tokens)

gpt-oss-120b

Output (/M Tokens)

gpt-oss-120b

131K

0.05

0.45

gpt-oss-20b

131K

0.04

0.18

Model Name

gpt-oss-20b

131K

Context Length

Input (/M Tokens)

gpt-oss-20b

Output (/M Tokens)

gpt-oss-20b

131K

0.04

0.18

Hunyuan-A13B-Instruct

131K

0.14

0.57

Model Name

Hunyuan-A13B-Instruct

131K

Context Length

Input (/M Tokens)

Hunyuan-A13B-Instruct

Output (/M Tokens)

Hunyuan-A13B-Instruct

131K

0.14

0.57

Hunyuan-MT-7B

33K

0.0

Model Name

Hunyuan-MT-7B

33K

Context Length

Input (/M Tokens)

Hunyuan-MT-7B

Output (/M Tokens)

Hunyuan-MT-7B

33K

0.0

Kimi-Dev-72B

131K

0.29

1.15

Model Name

Kimi-Dev-72B

131K

Context Length

Input (/M Tokens)

Kimi-Dev-72B

Output (/M Tokens)

Kimi-Dev-72B

131K

0.29

1.15

Kimi-K2-Instruct

131K

0.58

2.29

Model Name

Kimi-K2-Instruct

131K

Context Length

Input (/M Tokens)

Kimi-K2-Instruct

Output (/M Tokens)

Kimi-K2-Instruct

131K

0.58

2.29

Kimi-K2-Instruct-0905

262K

0.4

2.0

Model Name

Kimi-K2-Instruct-0905

262K

Context Length

Input (/M Tokens)

Kimi-K2-Instruct-0905

Output (/M Tokens)

Kimi-K2-Instruct-0905

262K

0.4

2.0

Ling-1T

131K

0.57

2.28

Model Name

Ling-1T

131K

Context Length

Input (/M Tokens)

Ling-1T

Output (/M Tokens)

Ling-1T

131K

0.57

2.28

Ling-flash-2.0

131K

0.14

0.57

Model Name

Ling-flash-2.0

131K

Context Length

Input (/M Tokens)

Ling-flash-2.0

Output (/M Tokens)

Ling-flash-2.0

131K

0.14

0.57

Ling-mini-2.0

131K

0.07

0.28

Model Name

Ling-mini-2.0

131K

Context Length

Input (/M Tokens)

Ling-mini-2.0

Output (/M Tokens)

Ling-mini-2.0

131K

0.07

0.28

Meta-Llama-3.1-8B-Instruct

33K

0.06

Model Name

Meta-Llama-3.1-8B-Instruct

33K

Context Length

Input (/M Tokens)

Meta-Llama-3.1-8B-Instruct

Output (/M Tokens)

Meta-Llama-3.1-8B-Instruct

33K

0.06

MiniMax-M1-80k

131K

0.55

2.2

Model Name

MiniMax-M1-80k

131K

Context Length

Input (/M Tokens)

MiniMax-M1-80k

Output (/M Tokens)

MiniMax-M1-80k

131K

0.55

2.2

Qwen2.5-14B-Instruct

33K

0.1

Model Name

Qwen2.5-14B-Instruct

33K

Context Length

Input (/M Tokens)

Qwen2.5-14B-Instruct

Output (/M Tokens)

Qwen2.5-14B-Instruct

33K

0.1

Qwen2.5-32B-Instruct

33K

0.18

Model Name

Qwen2.5-32B-Instruct

33K

Context Length

Input (/M Tokens)

Qwen2.5-32B-Instruct

Output (/M Tokens)

Qwen2.5-32B-Instruct

33K

0.18

Qwen2.5-72B-Instruct

33K

0.59

Model Name

Qwen2.5-72B-Instruct

33K

Context Length

Input (/M Tokens)

Qwen2.5-72B-Instruct

Output (/M Tokens)

Qwen2.5-72B-Instruct

33K

0.59

Qwen2.5-72B-Instruct-128K

131K

0.59

Model Name

Qwen2.5-72B-Instruct-128K

131K

Context Length

Input (/M Tokens)

Qwen2.5-72B-Instruct-128K

Output (/M Tokens)

Qwen2.5-72B-Instruct-128K

131K

0.59

Qwen2.5-7B-Instruct

33K

0.05

Model Name

Qwen2.5-7B-Instruct

33K

Context Length

Input (/M Tokens)

Qwen2.5-7B-Instruct

Output (/M Tokens)

Qwen2.5-7B-Instruct

33K

0.05

Qwen2.5-Coder-32B-Instruct

33K

0.18

Model Name

Qwen2.5-Coder-32B-Instruct

33K

Context Length

Input (/M Tokens)

Qwen2.5-Coder-32B-Instruct

Output (/M Tokens)

Qwen2.5-Coder-32B-Instruct

33K

0.18

Qwen2.5-VL-32B-Instruct

131K

0.27

Model Name

Qwen2.5-VL-32B-Instruct

131K

Context Length

Input (/M Tokens)

Qwen2.5-VL-32B-Instruct

Output (/M Tokens)

Qwen2.5-VL-32B-Instruct

131K

0.27

Qwen2.5-VL-72B-Instruct

131K

0.59

Model Name

Qwen2.5-VL-72B-Instruct

131K

Context Length

Input (/M Tokens)

Qwen2.5-VL-72B-Instruct

Output (/M Tokens)

Qwen2.5-VL-72B-Instruct

131K

0.59

Qwen2.5-VL-7B-Instruct

33K

0.05

Model Name

Qwen2.5-VL-7B-Instruct

33K

Context Length

Input (/M Tokens)

Qwen2.5-VL-7B-Instruct

Output (/M Tokens)

Qwen2.5-VL-7B-Instruct

33K

0.05

Qwen3-14B

131K

0.07

0.28

Model Name

Qwen3-14B

131K

Context Length

Input (/M Tokens)

Qwen3-14B

Output (/M Tokens)

Qwen3-14B

131K

0.07

0.28

Qwen3-235B-A22B

131K

0.35

1.42

Model Name

Qwen3-235B-A22B

131K

Context Length

Input (/M Tokens)

Qwen3-235B-A22B

Output (/M Tokens)

Qwen3-235B-A22B

131K

0.35

1.42

Qwen3-235B-A22B-Instruct-2507

262K

0.09

0.6

Model Name

Qwen3-235B-A22B-Instruct-2507

262K

Context Length

Input (/M Tokens)

Qwen3-235B-A22B-Instruct-2507

Output (/M Tokens)

Qwen3-235B-A22B-Instruct-2507

262K

0.09

0.6

Qwen3-235B-A22B-Thinking-2507

262K

0.13

0.6

Model Name

Qwen3-235B-A22B-Thinking-2507

262K

Context Length

Input (/M Tokens)

Qwen3-235B-A22B-Thinking-2507

Output (/M Tokens)

Qwen3-235B-A22B-Thinking-2507

262K

0.13

0.6

Qwen3-30B-A3B

131K

0.09

0.45

Model Name

Qwen3-30B-A3B

131K

Context Length

Input (/M Tokens)

Qwen3-30B-A3B

Output (/M Tokens)

Qwen3-30B-A3B

131K

0.09

0.45

Qwen3-30B-A3B-Instruct-2507

262K

0.09

0.3

Model Name

Qwen3-30B-A3B-Instruct-2507

262K

Context Length

Input (/M Tokens)

Qwen3-30B-A3B-Instruct-2507

Output (/M Tokens)

Qwen3-30B-A3B-Instruct-2507

262K

0.09

0.3

Qwen3-30B-A3B-Thinking-2507

262K

0.09

0.3

Model Name

Qwen3-30B-A3B-Thinking-2507

262K

Context Length

Input (/M Tokens)

Qwen3-30B-A3B-Thinking-2507

Output (/M Tokens)

Qwen3-30B-A3B-Thinking-2507

262K

0.09

0.3

Qwen3-32B

131K

0.14

0.57

Model Name

Qwen3-32B

131K

Context Length

Input (/M Tokens)

Qwen3-32B

Output (/M Tokens)

Qwen3-32B

131K

0.14

0.57

Qwen3-8B

131K

0.06

Model Name

Qwen3-8B

131K

Context Length

Input (/M Tokens)

Qwen3-8B

Output (/M Tokens)

Qwen3-8B

131K

0.06

Qwen3-Coder-30B-A3B-Instruct

262K

0.07

0.28

Model Name

Qwen3-Coder-30B-A3B-Instruct

262K

Context Length

Input (/M Tokens)

Qwen3-Coder-30B-A3B-Instruct

Output (/M Tokens)

Qwen3-Coder-30B-A3B-Instruct

262K

0.07

0.28

Qwen3-Coder-480B-A35B

262K

0.25

1.0

Model Name

Qwen3-Coder-480B-A35B

262K

Context Length

Input (/M Tokens)

Qwen3-Coder-480B-A35B

Output (/M Tokens)

Qwen3-Coder-480B-A35B

262K

0.25

1.0

Qwen3-Next-80B-A3B-Instruct

262K

0.14

1.4

Model Name

Qwen3-Next-80B-A3B-Instruct

262K

Context Length

Input (/M Tokens)

Qwen3-Next-80B-A3B-Instruct

Output (/M Tokens)

Qwen3-Next-80B-A3B-Instruct

262K

0.14

1.4

Qwen3-Next-80B-A3B-Thinking

262K

0.14

0.57

Model Name

Qwen3-Next-80B-A3B-Thinking

262K

Context Length

Input (/M Tokens)

Qwen3-Next-80B-A3B-Thinking

Output (/M Tokens)

Qwen3-Next-80B-A3B-Thinking

262K

0.14

0.57

Qwen3-Omni-30B-A3B-Captioner

66K

0.1

0.4

Model Name

Qwen3-Omni-30B-A3B-Captioner

66K

Context Length

Input (/M Tokens)

Qwen3-Omni-30B-A3B-Captioner

Output (/M Tokens)

Qwen3-Omni-30B-A3B-Captioner

66K

0.1

0.4

Qwen3-Omni-30B-A3B-Instruct

66K

0.1

0.4

Model Name

Qwen3-Omni-30B-A3B-Instruct

66K

Context Length

Input (/M Tokens)

Qwen3-Omni-30B-A3B-Instruct

Output (/M Tokens)

Qwen3-Omni-30B-A3B-Instruct

66K

0.1

0.4

Qwen3-Omni-30B-A3B-Thinking

66K

0.1

0.4

Model Name

Qwen3-Omni-30B-A3B-Thinking

66K

Context Length

Input (/M Tokens)

Qwen3-Omni-30B-A3B-Thinking

Output (/M Tokens)

Qwen3-Omni-30B-A3B-Thinking

66K

0.1

0.4

Qwen3-VL-235B-A22B-Instruct

262K

0.3

1.5

Model Name

Qwen3-VL-235B-A22B-Instruct

262K

Context Length

Input (/M Tokens)

Qwen3-VL-235B-A22B-Instruct

Output (/M Tokens)

Qwen3-VL-235B-A22B-Instruct

262K

0.3

1.5

Qwen3-VL-235B-A22B-Thinking

262K

0.45

3.5

Model Name

Qwen3-VL-235B-A22B-Thinking

262K

Context Length

Input (/M Tokens)

Qwen3-VL-235B-A22B-Thinking

Output (/M Tokens)

Qwen3-VL-235B-A22B-Thinking

262K

0.45

3.5

Qwen3-VL-30B-A3B-Instruct

262K

0.29

1.0

Model Name

Qwen3-VL-30B-A3B-Instruct

262K

Context Length

Input (/M Tokens)

Qwen3-VL-30B-A3B-Instruct

Output (/M Tokens)

Qwen3-VL-30B-A3B-Instruct

262K

0.29

1.0

Qwen3-VL-30B-A3B-Thinking

262K

0.29

1.0

Model Name

Qwen3-VL-30B-A3B-Thinking

262K

Context Length

Input (/M Tokens)

Qwen3-VL-30B-A3B-Thinking

Output (/M Tokens)

Qwen3-VL-30B-A3B-Thinking

262K

0.29

1.0

QwQ-32B

131K

0.15

0.58

Model Name

QwQ-32B

131K

Context Length

Input (/M Tokens)

QwQ-32B

Output (/M Tokens)

QwQ-32B

131K

0.15

0.58

Ring-1T

131K

0.57

2.28

Model Name

Ring-1T

131K

Context Length

Input (/M Tokens)

Ring-1T

Output (/M Tokens)

Ring-1T

131K

0.57

2.28

Ring-flash-2.0

131K

0.14

0.57

Model Name

Ring-flash-2.0

131K

Context Length

Input (/M Tokens)

Ring-flash-2.0

Output (/M Tokens)

Ring-flash-2.0

131K

0.14

0.57

Seed-OSS-36B-Instruct

262K

0.21

0.57

Model Name

Seed-OSS-36B-Instruct

262K

Context Length

Input (/M Tokens)

Seed-OSS-36B-Instruct

Output (/M Tokens)

Seed-OSS-36B-Instruct

262K

0.21

0.57

step3

66K

0.57

1.42

Model Name

step3

66K

Context Length

Input (/M Tokens)

step3

Output (/M Tokens)

step3

66K

0.57

1.42

Prices shown are per 1 million tokens.

Audio Models

Process and generate audio with our high-quality speech recognition and synthesis models.

Model Name

Output (/M UTF-8 bytes)

Fish-Speech-1.5

15.0

Model Name

Fish-Speech-1.5

15.0

Price (

)

/ M UTF-8 bytes

Fish-Speech-1.5

15.0

FunAudioLLM/CosyVoice2-0.5B

7.15

Model Name

FunAudioLLM/CosyVoice2-0.5B

7.15

Price (

)

/ M UTF-8 bytes

FunAudioLLM/CosyVoice2-0.5B

7.15

IndexTTS-2

7.15

Model Name

IndexTTS-2

7.15

Price (

)

/ M UTF-8 bytes

IndexTTS-2

7.15

Prices for transcription and translation are per minute of audio. Text-to-Speech prices are per 1,000 characters.

Frequently asked questions

How does billing work?

You're billed based on your usage. For chat models, you're charged per token for both input and output. For image, video, and audio models, pricing varies based on the specific task and output quality.

Are there any minimum commitments?

No, there are no minimum commitments. You only pay for what you use, and you can start with $1 in free credits.

Can I set spending limits?

Yes, you can set monthly spending limits in your account dashboard to control costs and prevent unexpected charges.

Do you offer volume discounts?

Yes, we offer volume discounts for high-usage customers. If your usage is substantial, please contact our sales team who can create a custom pricing plan tailored to your needs.

How do I get started?

Sign up for an account, get your API key, and start using our models right away. We provide comprehensive documentation and code examples to help you integrate quickly.

Ready to accelerate your AI development?