MiniMax-M1-80k Now Available on SiliconFlow

Jun 17, 2025

MiniMax-M1-80k (456B), the world’s first open-source hybrid-attention model at scale, is now available on SiliconFlow.

  • 128K context support

  • Competitively priced: $0.58/M tokens (input), $2.29/M tokens (output)
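At the listed rates, per-request cost is simple arithmetic. The sketch below estimates the bill for a single call; the helper name and the example token counts are illustrative, not part of the SiliconFlow API:

```python
# Listed SiliconFlow rates for MiniMax-M1-80k (USD per 1M tokens)
INPUT_PRICE_PER_M = 0.58
OUTPUT_PRICE_PER_M = 2.29

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 100K-token prompt with a 4K-token completion
print(f"${estimate_cost(100_000, 4_000):.4f}")  # → $0.0672
```

Even a prompt that nearly fills the 128K context stays well under a dollar per request at these rates.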

Built with a cutting-edge Mixture-of-Experts (MoE) architecture and Lightning Attention, MiniMax-M1-80k achieves state-of-the-art performance in long-context reasoning, programming tasks, and multi-step tool use.

  • Hybrid Attention + MoE architecture: M1 integrates the efficiency of Mixture-of-Experts routing with the depth of Lightning Attention, allowing it to scale while maintaining reasoning quality over long sequences.

  • Optimized for agents and tools: With support for extended context and strong reasoning, M1 is ideal for applications such as autonomous agents, document analysis, and sandboxed software development.

  • Math, coding, and reasoning: Benchmarks show M1 performs competitively with top-tier models on tasks requiring symbolic reasoning, structured output, and complex instruction following.


Quick Start

Try the MiniMax-M1-80k model on the SiliconFlow playground.


Quick Access to API

The following Python example demonstrates how to call the MiniMax-M1-80k model via SiliconFlow’s API endpoint. See the developer documentation for detailed API specifications.

from openai import OpenAI

url = 'https://api.ap.siliconflow.com/v1/'
api_key = 'your api_key'

client = OpenAI(
    base_url=url,
    api_key=api_key
)

# Send a request with streaming output
content = ""
reasoning_content = ""
messages = [
    {"role": "user", "content": "Who are the legendary athletes of the Olympics?"}
]
response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M1-80k",
    messages=messages,
    stream=True,  # Enable streaming output
    max_tokens=4096,
    extra_body={
        "thinking_budget": 1024
    }
)
# Gradually receive and process the response
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:
        content += delta.content
    # reasoning_content is an extension field; guard in case it is absent
    if getattr(delta, "reasoning_content", None):
        reasoning_content += delta.reasoning_content

# Round 2: append the first answer to the history, then ask the model to continue
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Continue"})
response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M1-80k",
    messages=messages,
    stream=True
)
# Consume the second streamed response the same way
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

MiniMax-M1-80k offers a unique balance of scale, efficiency, and reasoning power, built for developers pushing the limits of generative AI. Whether you're building long-context assistants, intelligent agents, or advanced code copilots — M1 is ready.

Now go build something extraordinary with MiniMax-M1-80k on SiliconFlow.

Ready to accelerate your AI development?


© 2025 SiliconFlow Technology PTE. LTD.
