gpt-oss-120b API, Fine-Tuning, Deployment

openai/gpt-oss-120b

gpt-oss-120b is OpenAI’s open-weight large language model with ~117B total parameters (5.1B active per token). Its Mixture-of-Experts (MoE) design and MXFP4 quantization let it run on a single 80 GB GPU. The model matches or exceeds o4-mini on reasoning, coding, health, and math benchmarks, supports full chain-of-thought (CoT) output and tool use, and is released under the Apache 2.0 license, permitting commercial deployment.
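
Because the MXFP4-quantized weights fit on a single 80 GB GPU, the model can also be run locally. Below is a minimal self-hosted inference sketch, assuming the Hugging Face transformers library and that the checkpoint is published on the Hub as openai/gpt-oss-120b; treat the exact loading arguments as illustrative rather than official.

from transformers import pipeline

# Load the quantized checkpoint; device_map="auto" places the weights
# on the available GPU, and dtype selection is left to the library.
pipe = pipeline(
    "text-generation",
    model="openai/gpt-oss-120b",
    torch_dtype="auto",
    device_map="auto",
)

# Chat-style input: the pipeline applies the model's chat template.
messages = [{"role": "user", "content": "how are you today"}]
result = pipe(messages, max_new_tokens=256)

# The pipeline returns the whole conversation; the last message
# is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])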

API Usage

curl --request POST \
  --url https://api.siliconflow.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "openai/gpt-oss-20b",
  "max_tokens": 512,
  "enable_thinking": true,
  "thinking_budget": 4096,
  "min_p": 0.05,
  "temperature": 0.7,
  "top_p": 0.7,
  "top_k": 50,
  "frequency_penalty": 0.5,
  "n": 1,
  "messages": [
    {
      "content": "how are you today",
      "role": "user"
    }
  ]
}'
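
The same request works with any OpenAI-compatible SDK. Here is a minimal Python sketch using the official openai package; note that enable_thinking, thinking_budget, min_p, and top_k are SiliconFlow-specific fields outside the standard schema, so they are passed through extra_body.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.siliconflow.com/v1",
    api_key="<token>",  # your SiliconFlow API key
)

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    max_tokens=512,
    temperature=0.7,
    top_p=0.7,
    frequency_penalty=0.5,
    n=1,
    messages=[{"role": "user", "content": "how are you today"}],
    # Provider-specific parameters are forwarded verbatim in the
    # request body via extra_body.
    extra_body={
        "enable_thinking": True,
        "thinking_budget": 4096,
        "min_p": 0.05,
        "top_k": 50,
    },
)

print(response.choices[0].message.content)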

Details

Model Provider: openai
Type: text
Sub Type: chat
Size: 120B
Publish Time: Aug 13, 2025
Input Price: $0.09 / M Tokens
Output Price: $0.45 / M Tokens
Context Length: 131K
Tags: MoE, 120B, 131K
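
At these rates, per-request cost is simple to estimate: input tokens bill at $0.09 per million and output tokens at $0.45 per million. A small Python sketch with hypothetical token counts:

# Published per-token rates (USD per million tokens).
INPUT_PRICE = 0.09 / 1_000_000
OUTPUT_PRICE = 0.45 / 1_000_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one chat completion."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# Example: 1,000 prompt tokens and 500 completion tokens.
print(f"${request_cost(1_000, 500):.6f}")  # -> $0.000315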


© 2025 SiliconFlow Technology PTE. LTD.