MiniMax-M1-80k Now Available on SiliconFlow

Jun 17, 2025

MiniMax-M1-80k (456B), the world’s first open-source hybrid-attention model at scale, is now available on SiliconFlow.

  • 128K context support

  • Competitively priced: $0.58/M tokens (input), $2.29/M tokens (output)
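At the listed rates, per-request cost is simple arithmetic. The sketch below estimates the bill for a single call; the helper name and the example token counts are illustrative, not part of the SiliconFlow API:

```python
# Listed SiliconFlow rates for MiniMax-M1-80k (USD per 1M tokens)
INPUT_PRICE_PER_M = 0.58
OUTPUT_PRICE_PER_M = 2.29

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 100K-token prompt with a 4K-token completion
print(f"${estimate_cost(100_000, 4_000):.4f}")  # → $0.0672
```

Even a prompt that nearly fills the 128K context stays well under a dollar per request at these rates.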

Built with a cutting-edge Mixture-of-Experts (MoE) architecture and Lightning Attention, MiniMax-M1-80k achieves state-of-the-art performance in long-context reasoning, programming tasks, and multi-step tool use.

  • Hybrid Attention + MoE architecture: M1 integrates the efficiency of Mixture-of-Experts routing with the depth of Lightning Attention, allowing it to scale while maintaining reasoning quality over long sequences.

  • Optimized for agents and tools: With support for extended context and strong reasoning, M1 is ideal for applications such as autonomous agents, document analysis, and sandboxed software development.

  • Math, coding, and reasoning: Benchmarks show M1 performs competitively with top-tier models on tasks requiring symbolic reasoning, structured output, and complex instruction following.


Quick Start

Try the MiniMax-M1-80k model on the SiliconFlow playground.


Quick Access to API

The following Python example demonstrates how to call the MiniMax-M1-80k model via SiliconFlow’s API endpoint. See the developer documentation for detailed API specifications.

from openai import OpenAI

url = 'https://api.ap.siliconflow.com/v1/'
api_key = 'your api_key'

client = OpenAI(
    base_url=url,
    api_key=api_key
)

# Send a request with streaming output
content = ""
reasoning_content = ""
messages = [
    {"role": "user", "content": "Who are the legendary athletes of the Olympics?"}
]
response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M1-80k",
    messages=messages,
    stream=True,  # Enable streaming output
    max_tokens=4096,
    extra_body={
        "thinking_budget": 1024
    }
)
# Gradually receive and process the response
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:
        content += delta.content
    # reasoning_content is an extension field; guard in case it is absent
    if getattr(delta, "reasoning_content", None):
        reasoning_content += delta.reasoning_content

# Round 2: append the first answer to the history, then ask the model to continue
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Continue"})
response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M1-80k",
    messages=messages,
    stream=True
)
# Consume the second streamed response the same way
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

MiniMax-M1-80k offers a unique balance of scale, efficiency, and reasoning power, built for developers pushing the limits of generative AI. Whether you're building long-context assistants, intelligent agents, or advanced code copilots — M1 is ready.

Now go build something extraordinary with MiniMax-M1-80k on SiliconFlow.

Ready to accelerate your AI development?


© 2025 SiliconFlow Technology PTE. LTD.
