Qwen3-Next-80B-A3B-Instruct API, Deployment, Pricing

Qwen/Qwen3-Next-80B-A3B-Instruct

Qwen3-Next-80B-A3B-Instruct is a next-generation foundation model released by Alibaba's Qwen team. It is built on the new Qwen3-Next architecture, which is designed for high training and inference efficiency. The model combines a Hybrid Attention mechanism (Gated DeltaNet and Gated Attention), a High-Sparsity Mixture-of-Experts (MoE) structure, and various stability optimizations. Although it is an 80-billion-parameter sparse model, it activates only about 3 billion parameters per token during inference. This significantly reduces computational cost and delivers over 10 times higher throughput than the Qwen3-32B model on long-context tasks exceeding 32K tokens. This is the instruction-tuned version, optimized for general-purpose tasks; it does not support 'thinking' mode. In terms of performance, it is comparable to Qwen's flagship model, Qwen3-235B, on certain benchmarks and shows significant advantages in ultra-long-context scenarios.
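To put the sparsity figures in perspective, the short sketch below computes the share of parameters active per token from the two round numbers quoted in the description (80 billion total, about 3 billion active). The constant and function names are illustrative, and the result is only a parameter-count ratio, not a measured compute or throughput figure.

```python
# Round figures taken from the description above; "about 3 billion" is approximate.
TOTAL_PARAMS = 80e9
ACTIVE_PARAMS = 3e9


def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters that are active for a single token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


print(f"{active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.2%}")  # prints 3.75%
```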

API Usage

curl --request POST \
  --url https://api.siliconflow.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "top_p": 0.7,
  "model": "Qwen/Qwen3-Next-80B-A3B-Instruct",
  "messages": [
    {
      "content": "I have 4 apples. I give 2 to my friend. How many apples do we have now?",
      "role": "user"
    }
  ]
}'
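For callers who prefer Python over curl, the following is a minimal sketch of the same call using only the standard library. It sends the model, top_p, and messages fields from the request above to the same URL. The helper names build_request and send are hypothetical, and the shape of the JSON response is not described on this page, so the sketch returns the decoded body without assuming any particular fields.

```python
import json
import urllib.request

API_URL = "https://api.siliconflow.com/v1/chat/completions"
MODEL = "Qwen/Qwen3-Next-80B-A3B-Instruct"


def build_request(prompt: str, token: str, top_p: float = 0.7) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = {
        "model": MODEL,
        "top_p": top_p,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs network access and a valid token)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_request("...", "<token>")) performs the actual HTTP request; building the request separately makes it easy to inspect the payload before anything is sent.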

Details

Model Provider: Qwen
Type: text
Sub Type: chat
Size: 80B
Publish Time: Sep 18, 2025
Input Price: $0.14 / M Tokens
Output Price: $1.4 / M Tokens
Context length: 262K
Tags: MoE, 80B, 262K
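As a rough illustration of how the listed prices translate into per-request cost, the sketch below multiplies token counts by the input and output prices given above ($0.14 and $1.4 per million tokens). The function name and the example token counts are made up for illustration, and the estimate assumes billing is strictly linear in tokens, which this page does not confirm.

```python
# Prices as listed in the details above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 1.4


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming simple per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical long-context request: 200K input tokens, 2K output tokens.
print(f"${estimate_cost_usd(200_000, 2_000):.4f}")  # prints $0.0308
```

Because output tokens cost ten times as much as input tokens, long prompts with short answers stay comparatively cheap under these prices.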


© 2025 SiliconFlow Technology PTE. LTD.
