Nex-N2-Pro Now on SiliconFlow: Agentic Thinking, Frontier Coding, and Long-Horizon Execution

Jun 5, 2026


TL;DR: Nex-N2-Pro, the flagship of nex-agi's next-generation agentic model family, is now live on SiliconFlow. Built on an Agentic Thinking framework and post-trained on Qwen3.5-397B-A17B, it turns reasoning into executable, verifiable, and iterable action across agentic coding, deep research, tool calling, and terminal execution. It scores 75.3 on Terminal-Bench 2.1 (ahead of Opus 4.7 and DeepSeek-V4-Pro), edges GPT-5.5 on SWE-Bench Pro (58.8), and hits 83.7 on BrowseComp, keeping pace with top-tier frontier systems while shipping fully open-weight under Apache-2.0. Start building with SiliconFlow's API today.

Try Nex-N2-Pro on SiliconFlow

Free for the First 2 Weeks.
Seamless Integration: Nex-N2-Pro works with your existing development ecosystem. Deploy it through SiliconFlow's OpenAI-compatible API with Cline, Gen-CLI, Kilo Code, and Roo Code, or through the Anthropic-compatible API with Claude Code; plug it into agents such as OpenClaw and Hermes Agent; use it directly in Dify, Janitor AI, Chub AI, ChatHub, Chatbox, and Sider; or reach it through OpenRouter.

Introduction

The competition between next-generation models is no longer about whether a model can think. It is about whether a model can reliably and efficiently turn thinking into actions that are executable, verifiable, and iterable. Over the past year, the rise of Vibe Coding and Harness Engineering has pushed agents from dialogue and reasoning toward long-horizon tasks with real environmental feedback — longer contexts, harder problems, and messier execution environments where a single brittle step breaks the whole chain.

Nex-N2-Pro is nex-agi's answer to that shift, and it is now available on SiliconFlow. Rather than treating reasoning, tool use, and environment execution as three separate capabilities bolted together, Nex-N2-Pro unifies them into a single closed loop through its Agentic Thinking framework. The result is an agent model built for real-world productivity: it keeps driving complex, multi-step tasks forward in live environments and delivers stable, end-to-end results instead of an impressive demo that stalls on the second tool call.

Benchmark Performance

Nex-N2-Pro is evaluated the way it is meant to be used — inside real agentic workflows, across three directions: agentic tasks, coding tasks, and general tasks. The evaluation suite spans tool calling, search-based decision-making, software engineering, and terminal execution, and the headline result is consistent: Nex-N2-Pro keeps pace with top-tier proprietary systems such as GPT-5.5 and Opus 4.7, while leading or matching the strongest open-weight models. It is especially strong on coding and long-horizon execution, and it shows standout generalization on newer, harder benchmarks like SWE-Atlas and DeepSWE.

Terminal execution that leads its class. 75.3 on Terminal-Bench 2.1 — ahead of Opus 4.7 (69.7), DeepSeek-V4-Pro (72.0), and GLM-5.1 (58.7).
Frontier software engineering. 58.8 on SWE-Bench Pro edges GPT-5.5 (58.6), and 80.8 on SWE-Bench Verified sits right alongside the strongest open peers.
Deep research and browsing. 83.7 on BrowseComp clears Opus 4.7 (79.8) and runs neck-and-neck with GPT-5.5 (84.4) and DeepSeek-V4-Pro (83.4).
Long-horizon productivity. 1585 on GDPval, reflecting the model's ability to sustain multi-step, real-world economic tasks rather than one-shot answers.
Generalization to new benchmarks. 40.0 on SWE Atlas TW beats both Opus 4.7 (38.2) and MiniMax M3 (30.8), evidence that the gains hold up outside the well-trodden evaluations.
Solid core reasoning. 90.7 on GPQA Diamond and 94.0 on IFEval keep it competitive with leading frontier models on general capability.

Innovations: Agentic Thinking Framework

The core innovation in Nex-N2-Pro is treating an agentic task as a single continuous loop instead of a hand-off between disconnected modules. The Agentic Thinking framework connects requirement understanding, task planning, code implementation, environmental feedback, evaluation and debugging, and continuous iteration — so the model reads a goal, plans, acts, observes what the environment returns, debugs, and tries again without losing the thread. This is what lets Nex-N2-Pro sustain long-horizon work where each step depends on the verified outcome of the last.
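The loop described above can be pictured from the caller's side with a small harness. This is only a sketch under stated assumptions, not nex-agi's implementation: `agent_loop`, `make_demo`, the action dictionary shape, and the two toy tools are all hypothetical names invented for illustration, and the `decide` function is a scripted stand-in for a real model call.

```python
import json
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]
Action = Dict[str, object]


def agent_loop(
    goal: str,
    decide: Callable[[List[Message]], Action],
    tools: Dict[str, Callable[..., str]],
    max_steps: int = 8,
) -> Tuple[Optional[str], List[Message]]:
    """Plan, act, observe, and retry until the model returns a final answer."""
    history: List[Message] = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        action = decide(history)  # the model reads the history and picks the next step
        if action["type"] == "final":
            return str(action["content"]), history
        name = str(action["name"])
        tool = tools.get(name)
        if tool is None:
            observation = f"error: unknown tool {name!r}"
        else:
            try:
                observation = tool(**action["arguments"])  # act in the environment
            except Exception as exc:  # feed failures back instead of crashing
                observation = f"error: {exc}"
        # Record both the action and what the environment returned.
        history.append({"role": "assistant", "content": json.dumps(action)})
        history.append({"role": "tool", "name": name, "content": str(observation)})
    return None, history  # step budget exhausted without a final answer


def make_demo():
    """Toy environment: a failing test suite that one patch repairs."""
    state = {"bug_fixed": False}

    def run_tests() -> str:
        return "passed" if state["bug_fixed"] else "failed: test_add"

    def apply_fix() -> str:
        state["bug_fixed"] = True
        return "patched"

    def decide(history: List[Message]) -> Action:
        # Scripted stand-in for a model: react to the latest observation.
        last = history[-1]
        if last["role"] == "user":
            return {"type": "tool", "name": "run_tests", "arguments": {}}
        if last["content"].startswith("failed"):
            return {"type": "tool", "name": "apply_fix", "arguments": {}}
        if last["content"] == "patched":
            return {"type": "tool", "name": "run_tests", "arguments": {}}
        return {"type": "final", "content": "tests pass"}

    return decide, {"run_tests": run_tests, "apply_fix": apply_fix}


if __name__ == "__main__":
    decide, tools = make_demo()
    answer, history = agent_loop("Make the test suite pass", decide, tools)
    print(answer)
```

In a real integration, `decide` would wrap a chat completions request and the tools would touch an actual repository or terminal; the point of the sketch is only that each step is chosen from the verified result of the previous one.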

The framework has two complementary parts:

Adaptive Thinking lets the model decide on its own when to think and how deeply. It executes simple actions quickly and reserves thorough reasoning for critical decisions that actually move a task forward. The payoff is efficiency: 30 to 50% fewer wasted tokens on trivial steps, and more deliberation where a mistake would be expensive.
Coherent Thinking carries one consistent reasoning paradigm across general reasoning and diverse agentic tasks. Because the model reasons the same way whether it is browsing, writing code, or driving a terminal, capability transfers cleanly across tasks and modalities instead of fragmenting per domain.

Real-World Applications

Benchmarks are the proxy; the point is the work. In real productivity scenarios, Nex-N2-Pro is built to own the full job end to end:

One-person-company workflows (OpenClaw). Nex-N2-Pro can sit at the center of an OpenClaw-style operation, decomposing a high-level objective into planning, execution, and iteration across multiple tools. It coordinates the many small decisions a solo operator would otherwise make by hand.
End-to-end game development. From a spec to a playable build, the model drives the implement-run-debug cycle that long-horizon coding requires. Its code-plus-environment competence separates a working build from a snippet that won't compile.
Deep research and web tasks. Nex-N2-Pro handles search-based, multi-hop investigation — gathering evidence, cross-checking sources, and synthesizing findings rather than returning a single lookup.
Web and multimodal generation. The model extends into generating web interfaces and multimodal outputs, turning a description into a structured, working artifact as part of a larger agentic pipeline.

Showcase prompt: Design an iOS prototype for a Newbery Medal children's book recommendation app, with 3 core screens that are actually tappable.

Get Started Immediately

Build: Try Nex-N2-Pro on SiliconFlow through the early-access playground.
Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the SiliconFlow API documentation.

import requests

# OpenAI-compatible chat completions endpoint on SiliconFlow
url = "https://api.siliconflow.com/v1/chat/completions"

payload = {
    "model": "nex-agi/Nex-N2-Pro",
    "messages": [
        {
            "role": "user",
            "content": "Please provide information about a person in the following JSON format: { \"name\": \"string\", \"age\": \"number\", \"occupation\": \"string\", \"hobbies\": [\"string\"] } Generate a realistic example."
        }
    ],
    "stream": True,  # ask for the reply to be streamed back in chunks
    "max_tokens": 4095,
    "min_p": 0.05,
    "stop": None,
    "temperature": 0.7,
    "top_p": 0.7,
    "top_k": 50,
    "frequency_penalty": 0.5,
    "n": 1,
    "response_format": { "type": "json_object" },
    # Placeholder tool schema: fill in a real function name, description,
    # and JSON Schema parameters, or remove "tools" if you do not need it.
    "tools": [
        {
            "type": "function",
            "function": {
                "description": "<string>",
                "name": "<string>",
                "parameters": {},
                "strict": False
            }
        }
    ]
}
headers = {
    "Authorization": "Bearer <token>",  # replace <token> with your API key
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.text)
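Because the request above sets "stream" to True, the reply arrives in pieces rather than as one JSON body. The sketch below shows one way to reassemble the text, assuming the stream follows the OpenAI-style server-sent event layout (lines of the form `data: {json}`, text under `choices[].delta.content`, and a closing `data: [DONE]`). That layout is an assumption based on the API being described as OpenAI-compatible, so check the SiliconFlow API documentation before relying on it. `iter_content_deltas` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chunks (assumed format)."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    # Offline sample shaped like the assumed stream; no network call is made.
    sample = [
        b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
        b"",
        b'data: {"choices": [{"delta": {"content": "lo"}}]}',
        b"data: [DONE]",
    ]
    print("".join(iter_content_deltas(sample)))
```

With `requests`, the same function can consume a live response by passing `stream=True` to `requests.post(...)` and feeding it `response.iter_lines()`, which yields the raw lines as bytes.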
