
MiniMaxAI
Text Generation
MiniMax-M3
MiniMax-M3 is MiniMax’s frontier multimodal coding and agentic model, built on the MiniMax Sparse Attention (MSA) architecture. It supports up to a 1M-token context window and accepts image and video inputs. The model is designed for code generation, agentic workflows, tool use, long-context understanding, and multi-step reasoning, showing strong performance on benchmarks such as SWE-Bench Pro, Terminal-Bench 2.1, and MCP Atlas....
Total context: 1049K
Max output: 131K
Input: $0.3 / M Tokens
Output: $1.2 / M Tokens
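The per-million-token prices on these cards translate into a per-request cost with simple arithmetic. Below is a minimal sketch using the MiniMax-M3 prices listed above ($0.3 input, $1.2 output per M tokens). The function name `estimate_cost_usd` and the token counts are hypothetical, and the sketch assumes billing is linear in token count with no caching discounts or tiered rates, which the listing does not confirm.

```python
from decimal import Decimal


def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_m, output_price_per_m):
    """Estimate the USD cost of one request from per-million-token prices.

    Assumes linear billing (an assumption, not stated on the page).
    Prices are passed as strings so Decimal keeps them exact.
    """
    tokens_per_million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m)
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m)
    return (input_cost + output_cost) / tokens_per_million


# Hypothetical request: 200,000 input tokens and 8,000 output tokens.
cost = estimate_cost_usd(200_000, 8_000, "0.3", "1.2")
print(f"${cost}")  # prints "$0.0696"
```

Swapping in the input and output prices from any other card gives a like-for-like comparison for the same workload.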

Nex AGI
Text Generation
Nex-N2-Pro
Nex-N2 is a family of thinking models with Agentic Thinking. They adaptively decide when and how deeply to reason, unifying agent cognition across coding, search, and tool use into a single coherent paradigm.
Key Claims:
- SOTA among open models on SWE-Verified, SWE-Pro, Terminal Bench 2.0, Tau3, WildClawBench, BFCL V4
- Top-tier in agentic coding (end-to-end dev loops), deep search (BrowseComp, Wild Search, FinSearch), and real-world productivity (GDP Val)
- Adaptive Thinking: auto-adjusts reasoning depth per step, using 30-50% fewer thinking tokens than always-on thinking, with equal or better performance
- Plug-and-play with Claude Code, Cursor, OpenClaw, and agentic harnesses...
Total context: 262K
Max output: 256K
Input: $0.0 / M Tokens
Output: $0.0 / M Tokens

Moonshot AI
Text Generation
Kimi-K2.6
Kimi K2.6 is an open-source, native multimodal agentic model by Moonshot AI, achieving open-source state-of-the-art on benchmarks including HLE with tools, SWE-Bench Pro, and BrowseComp. Built on a MoE architecture with 1T total parameters and 32B activated, the model supports a 256K-token context window and multimodal inputs (image and video) via its MoonViT vision encoder. K2.6 is optimized for agentic workloads: it sustains 4,000+ tool calls over 12+ hours of continuous execution, scales to 300 parallel sub-agents × 4,000 steps per run to produce 100+ files from a single prompt, and supports both Thinking and Instant inference modes with function calling and multi-turn Preserve Thinking...
Total context: 262K
Max output: 262K
Input: $0.77 / M Tokens
Output: $4.0 / M Tokens

Qwen
Text Generation
Qwen3.6-35B-A3B
Qwen3.6-35B-A3B is a large language model from Alibaba's Qwen3.6 series, featuring a Mixture of Experts (MoE) architecture with 35 billion total parameters and approximately 3 billion active parameters per inference, delivering strong performance with efficient compute utilization. The model supports both thinking and non-thinking modes, offering flexible switching between rapid response and deep reasoning...
Total context: 262K
Max output: 262K
Input: $0.2 / M Tokens
Output: $1.6 / M Tokens
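The Qwen3.6-35B-A3B description above quotes 35 billion total parameters with approximately 3 billion active per inference. As a rough illustration of what that split means, the sketch below computes the share of parameters active per token. The helper name `active_fraction` is hypothetical, and the figures are the rounded ones from the card, so the result is only approximate.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of a MoE model's parameters that are active per token.

    Both arguments are in billions of parameters, as quoted on the cards.
    """
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# Rounded figures from the Qwen3.6-35B-A3B card: 35B total, ~3B active.
share = active_fraction(35, 3)
print(f"{share:.1%}")  # prints "8.6%"
```

The same calculation applies to the other MoE cards on this page, for example 397B total with 17B active for Qwen3.5-397B-A17B.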

Qwen
Text Generation
Qwen3.6-27B
Qwen3.6-27B is the first open-weight small-to-mid-sized dense model in the Qwen3.6 series, with targeted improvements for code generation, agent workflows, and real-world development tasks. Compared with Qwen3.5-27B, it delivers clear gains in frontend development, repository-level reasoning, tool use, and complex problem solving, while adding support for preserving reasoning context across turns to reduce redundant reasoning in iterative workflows. It also supports vision understanding with a native context length of 262,144 tokens...
Total context: 262K
Max output: 262K
Input: $0.3 / M Tokens
Output: $3.2 / M Tokens

Z.ai
Text Generation
GLM-5V-Turbo
GLM-5V-Turbo is Zhipu’s latest flagship multimodal foundation model, optimized for multimodal coding and agent capabilities. It supports up to 200K tokens of image, video, and text context, and, when integrated with frameworks such as Claude Code and OpenClaw, can handle complex long-horizon programming and assistant tasks....
Total context: 205K
Max output: 131K
Input: $1.2 / M Tokens
Output: $4.0 / M Tokens

Qwen
Text Generation
Qwen3.5-397B-A17B
Qwen3.5-397B-A17B is the latest vision-language model in the Qwen series, featuring a Mixture-of-Experts (MoE) architecture with 397B total parameters and 17B activated parameters. It natively supports 256K context length, extensible to approximately 1M tokens, with support for 201 languages, unified vision-language understanding, tool calling, and reasoning (thinking) mode...
Total context: 262K
Max output: 262K
Input: $0.39 / M Tokens
Output: $2.34 / M Tokens

Qwen
Text Generation
Qwen3.5-122B-A10B
Qwen3.5-122B-A10B is a native multimodal large language model from the Qwen team, with 122B total parameters and only 10B activated. It features an efficient hybrid architecture combining Gated Delta Networks with sparse Mixture-of-Experts (MoE), natively supporting a 256K context length extensible up to ~1M tokens. Through early fusion training, it achieves unified vision-language capabilities supporting text, image, and video understanding, with strong performance across knowledge, reasoning, coding, agents, visual understanding, and multilingual benchmarks, surpassing GPT-5-mini and Qwen3-235B-A22B on multiple metrics. It defaults to thinking mode, supports tool calling, and covers 201 languages and dialects...
Total context: 262K
Max output: 262K
Input: $0.26 / M Tokens
Output: $2.08 / M Tokens

Qwen
Text Generation
Qwen3.5-35B-A3B
Qwen3.5-35B-A3B is a native multimodal large language model from the Qwen team, with 35B total parameters and only 3B activated. It features an efficient hybrid architecture combining Gated Delta Networks with sparse Mixture-of-Experts (MoE), natively supporting a 262K context length extensible up to ~1M tokens. The model achieves unified vision-language capabilities through early fusion training, supporting text, image, and video understanding with strong performance across reasoning, coding, agents, and visual understanding benchmarks. It defaults to thinking mode, supports tool calling, and covers 201 languages and dialects...
Total context: 262K
Max output: 262K
Input: $0.24 / M Tokens
Output: $1.8 / M Tokens

Qwen
Text Generation
Qwen3.5-27B
Qwen3.5-27B is a native multimodal large language model from the Qwen team with 27B parameters. It features an efficient hybrid architecture combining Gated Delta Networks with Gated Attention, natively supporting a 256K context length extensible up to ~1M tokens. The model achieves unified vision-language capabilities through early fusion training, supporting text, image, and video understanding with strong performance across reasoning, coding, agents, and visual understanding benchmarks, surpassing Qwen3-235B-A22B and GPT-5-mini on multiple metrics. It defaults to thinking mode, supports tool calling, and covers 201 languages and dialects...
Total context: 262K
Max output: 262K
Input: $0.25 / M Tokens
Output: $2.0 / M Tokens

Qwen
Text Generation
Qwen3.5-9B
Qwen3.5-9B is a native multimodal large language model from the Qwen team with 9B parameters. As a lightweight dense model in the Qwen3.5 series, it features an efficient hybrid architecture combining Gated Delta Networks with Gated Attention, natively supporting a 262K context length extensible up to ~1M tokens. The model achieves unified vision-language capabilities through early fusion training, supporting text, image, and video understanding. It defaults to thinking mode, supports tool calling, and covers 201 languages and dialects...
Total context: 262K
Max output: 262K
Input: $0.1 / M Tokens
Output: $0.15 / M Tokens

Moonshot AI
Text Generation
Kimi-K2.5
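Kimi K2.5 is an open-source, native multimodal agentic model built on Kimi-K2-Base through continued pretraining on approximately 15 trillion mixed visual and text tokens. With a 1T-parameter MoE architecture (32B active) and a 256K context length, it seamlessly integrates vision and language understanding with advanced agentic capabilities, supporting both instant and thinking modes as well as conversational and agentic paradigms...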
Total context: 262K
Max output: 262K
Input: $0.45 / M Tokens
Output: $2.25 / M Tokens
Text Generation
gemma-4-26B-A4B-it
Gemma 4 26B is Google DeepMind's latest open-source MoE model, built on a 26B-parameter Mixture of Experts architecture that activates only 3.8B parameters during inference for exceptionally fast token throughput. Purpose-built for advanced reasoning and agentic workflows, it ranks #6 among all open models on the Arena AI leaderboard — outperforming models up to 20x its size — with native function-calling, 256K context, and full Apache 2.0 licensing....
Total context: 262K
Max output: 262K
Input: $0.12 / M Tokens
Output: $0.4 / M Tokens
Text Generation
gemma-4-31B-it
Gemma 4 31B is Google DeepMind's latest open-source model, built on a 31B dense architecture from the same research foundation as Gemini 3. Purpose-built for advanced reasoning and agentic workflows, it ranks #3 among all open models on the Arena AI leaderboard — outperforming models up to 20x its size — with native function-calling, 256K context, and full Apache 2.0 licensing....
Total context: 262K
Max output: 262K
Input: $0.13 / M Tokens
Output: $0.4 / M Tokens

Qwen
Text Generation
Qwen3-VL-32B-Instruct
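Qwen3-VL is the vision-language model in the Qwen3 series, achieving state-of-the-art (SOTA) performance on a range of vision-language (VL) benchmarks. The model supports high-resolution image inputs up to the megapixel level and offers strong general visual understanding, multilingual OCR, fine-grained visual grounding, and visual dialogue capabilities. As part of the Qwen3 series, it inherits a strong language foundation that lets it understand and follow complex instructions....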
Total context: 262K
Max output: 262K
Input: $0.2 / M Tokens
Output: $0.6 / M Tokens

Qwen
Text Generation
Qwen3-VL-32B-Thinking
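Qwen3-VL-Thinking is the version of the Qwen3-VL series specifically optimized for complex visual reasoning tasks. It incorporates a "thinking mode" that lets it generate detailed intermediate reasoning steps (chain of thought) before giving a final answer. This design significantly improves the model's performance on visual question answering (VQA) and other vision-language tasks that require multi-step logic, planning, and in-depth analysis....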
Total context: 262K
Max output: 262K
Input: $0.2 / M Tokens
Output: $1.5 / M Tokens

Qwen
Text Generation
Qwen3-VL-8B-Instruct
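Qwen3-VL-8B-Instruct is a vision-language model in the Qwen3 series, demonstrating strong capabilities in general visual understanding, vision-centric dialogue, and multilingual text recognition in images....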
Total context: 262K
Max output: 262K
Input: $0.18 / M Tokens
Output: $0.68 / M Tokens

Qwen
Text Generation
Qwen3-VL-30B-A3B-Instruct
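The Qwen3-VL series offers superior text understanding and generation, deeper visual perception and reasoning, extended context length, enhanced understanding of spatial and video dynamics, and stronger agent interaction capabilities. It is available in dense and MoE architectures that scale from edge to cloud, with Instruct and reasoning-enhanced Thinking editions....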
Total context: 262K
Max output: 262K
Input: $0.29 / M Tokens
Output: $1.0 / M Tokens

Qwen
Text Generation
Qwen3-VL-30B-A3B-Thinking
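The Qwen3-VL series offers superior text understanding and generation, deeper visual perception and reasoning, extended context length, enhanced understanding of spatial and video dynamics, and stronger agent interaction capabilities. It is available in dense and MoE architectures that scale from edge to cloud, with Instruct and reasoning-enhanced Thinking editions....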
Total context: 262K
Max output: 262K
Input: $0.29 / M Tokens
Output: $1.0 / M Tokens

