目錄

TL; DR: Gemma-4-31B and Gemma-4-26B-A4B are now available on SiliconFlow. The Gemma 4 family of multimodal models by Google DeepMind delivers frontier-level performance across text, image, video, and audio inputs, with different sizes optimized for different hardware requirements. Achieving an Arena AI score of 1452 (31B) and 1441 (26B A4B), Gemma 4 excels in reasoning, coding, and agentic workflows. Start building with SiliconFlow's API today to power your AI product.
Overview: Maximize Intelligence-per-parameter
Gemma 4, Google DeepMind's latest family of open multimodal models purpose-built for advanced reasoning and agentic workflows, is now officially live on SiliconFlow.
This release brings comprehensive multimodal capabilities and exceptional performance across diverse benchmarks, enabling developers to achieve seamless integration of vision, audio, and text understanding with greater efficiency and reliability. Built from Gemini 3 and as Google's most intelligent open models, Gemma 4 is truly open with Apache 2.0 licenses.
Through SiliconFlow's Gemma 4 API, You Can Expect
Cost-effective Pricing: Gemma-4-31B: $0.13/M tokens (Input), $0.40/M tokens (Output); Gemma-4-26B-A4B: $0.12/M tokens (Input), $0.40/M tokens (Output)
Up to 262K Context Window: Perfect for long documents, complex reasoning, and extended agentic tasks
Advanced Reasoning & agentic workflows: Native function calling for autonomous agents like Hermes Agent, OpenClaw, and Claude Code
Developer-Ready Integration: Instant compatibility with your existing stack, deploy via SiliconFlow's OpenAI/Anthropic-compatible API through Claude Code, Gen-CLI and Cline; ready-to-use in Dify, ChatHub, Chatbox, Sider, and also available through OpenRouter
Core Capabilities & Benchmark Performance
The different model sizes and precisions represent a set of trade-offs for AI application. Gemma 4 resolves this through architecture innovation and comprehensive training, achieving both frontier-level performance and true deployment versatility.
Reasoning: All models in the family are designed as highly capable reasoners, with configurable thinking modes
Diverse & Efficient Architectures: Offers a powerful 31B parameter dense model, and a highly efficient 26B MoE model
Frontier Multimodal Performance: Achieves Arena AI text score of 1452 (31B) and 1441 (26B A4B with just 4B active parameters), placing it among the world's top-tier models
Comprehensive Modality Support: Natively processes text, images, video, and audio inputs (E2B/E4B support all modalities; larger models support text+vision)
Enhanced Reasoning & Coding: 89.2% on AIME 2026, 80.0% on LiveCodeBench v6, and 86.4% on τ²-Bench agentic tool use (31B model)
Truly Open Licensing: Apache 2.0 license permits responsible commercial use, fine-tuning, and deployment without restrictions


Gemma 4 consistently outperforms previous-generation models including Gemma 3 27B across reasoning, coding, vision, and agentic benchmarks, while maintaining deployment flexibility that frontier closed models cannot match. The 26B A4B mixture-of-experts model delivers near-31B performance with only 4B active parameters, making frontier capabilities accessible on consumer hardware.
Real-World Applications
From reasoning and coding to vision and long-context tasks, Gemma 4 empowers developers to push the boundaries of AI across every dimension.
Vision-Language Coding Assistants: Developers can provide screenshots of UI elements, design mockups, or application interfaces, and receive accurate HTML/CSS code generation, GUI element detection with bounding boxes, or object detection—enabling rapid prototyping and automated design-to-code workflows. The models natively respond in structured JSON format without requiring specific instructions
Multimodal Agentic Systems: Build intelligent agents that combine vision and text understanding for complex tool use scenarios. Gemma 4 excels at multimodal function calling, allowing agents to analyze images and invoke appropriate tools based on visual context
Possible Use-cases: Generate creative text formats, text summarization, chatbots and conversational AI, image data extraction, research and education…
Get Started Immediately
Explore: Try Gemma-4-31B or Gemma-4-26B-A4B in the SiliconFlow playground.
Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the SiliconFlow API documentation.
Join our Discord community now →
