Google’s official announcement confirms the formal launch of the next-generation open model series Gemma 4. The model uses the same technical architecture as Gemini 3, fully shifts to a business-friendly Apache 2.0 license, and emphasizes powerful on-device execution capabilities.
(Background: Google’s quantum computer reportedly cracked Bitcoin in 9 minutes—how were those numbers calculated, and where is the real threat?)
(Additional context: AI predicts natural disasters: Google launches the “Groundsource” framework, using Gemini to convert global news into 2.6 million life-saving data points)
Google has once again shaken up the open-source AI field. The official announcement introduces the “Gemma 4” series, positioning it as the most intelligent open model lineup to date. Gemma 4 directly inherits the world-class research behind the flagship Gemini 3, delivering breakthrough reasoning capabilities and agentic workflows. What has drawn the most attention from the community is that Google has responded to developers’ demands this time, fully switching to the commercially friendly Apache 2.0 license, so users can freely build and securely deploy in any environment, with full control over their own data and infrastructure.
We just released Gemma 4 — our most intelligent open models to date.
Built from the same world-class research as Gemini 3, Gemma 4 brings breakthrough intelligence directly to your own hardware for advanced reasoning and agentic workflows.
Released under a commercially… pic.twitter.com/W6Tvj9CuHW
— Google (@Google) April 2, 2026
To cover different hardware and application scenarios, Gemma 4 ships in four sizes. The lightest, E2B (2B parameters), is designed for mobile and edge devices such as web browsers; E4B (4B parameters) balances performance and efficiency while adding native support for visual and audio input. At the high-performance end, the 26B A4B uses a Mixture of Experts (MoE) architecture that activates only about 4B parameters per inference step, dramatically reducing memory requirements so it runs smoothly even on consumer-grade hardware such as a Mac Mini with 24GB of memory. The top-tier 31B dense model is the performance flagship of the series.
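To see why an MoE model of this size fits on consumer hardware, a rough back-of-envelope sketch helps: all 26B weights must be resident in memory, but only the ~4B active parameters participate in each forward pass. The figures below are illustrative estimates (weights only, ignoring KV cache and runtime overhead), not official numbers.

```python
# Back-of-envelope VRAM estimate for an MoE model like the 26B A4B.
# All expert weights must be held in memory, but only the active
# subset (~4B parameters) is used per token.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

total_params = 26   # billions, all experts combined
active_params = 4   # billions activated per token

for label, bpp in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"{label}: {weight_memory_gb(total_params, bpp):.1f} GB total weights, "
          f"~{weight_memory_gb(active_params, bpp):.1f} GB active per token")
```

At a 4-bit quantization, the full 26B weight set comes to roughly 13 GB, which is consistent with the article's claim that the model runs on a 24GB Mac Mini with room to spare for the KV cache.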
In terms of technical specifications, the larger Gemma 4 variants support a context window of up to 256K tokens, letting developers process an entire codebase or a large document corpus in one pass. Beyond native support for text and images (E2B and E4B also handle audio), Gemma 4 ships with strong native function calling, reliably emitting structured JSON output, which provides an excellent foundation for building autonomous agent applications. Its training data covers more than 140 languages, giving it broad global applicability.
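In an agent loop, the practical value of reliable JSON output is that tool calls can be parsed and validated mechanically. The sketch below shows that consumer side only; the tool name `get_weather` and the output shape are hypothetical illustrations, not part of any official Gemma 4 API.

```python
import json

# Hypothetical example: validating a structured tool call emitted by a model.
# The schema and the "get_weather" tool are illustrative placeholders.

raw_output = '{"name": "get_weather", "arguments": {"city": "Taipei", "unit": "celsius"}}'

def parse_tool_call(text: str) -> dict:
    """Parse the model's JSON output and check the fields an agent loop needs."""
    call = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(call.get("name"), str):
        raise ValueError("missing tool name")
    if not isinstance(call.get("arguments"), dict):
        raise ValueError("missing arguments object")
    return call

call = parse_tool_call(raw_output)
print(call["name"], call["arguments"]["city"])  # get_weather Taipei
```

A model that only emits JSON "most of the time" forces retry logic around this parse step, which is why native structured output matters for agent reliability.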
Gemma 4 emphasizes extremely high “per-token efficiency.” According to data from open-model leaderboards such as AI Arena, Gemma-4-31B currently ranks 3rd among open models. Its overall performance is comparable to the much larger Qwen3.5-397B at less than one-tenth the parameter count. On the graduate-level reasoning benchmark GPQA Diamond, the 31B version also posted an impressive 84.3%.
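The size comparison above is simple arithmetic on the two parameter counts cited in this article:

```python
# Parameter-count ratio between the two models cited above.
gemma_params = 31e9   # Gemma-4-31B
qwen_params = 397e9   # Qwen3.5-397B

ratio = gemma_params / qwen_params
print(f"{ratio:.3f}")  # 0.078, i.e. under one-tenth the parameter count
```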
Let’s look at how the open model Gemma has progressed across its last three versions.
– Gemma 4 ranks 100 places above Gemma 3
– Gemma 3 ranks 87 places above Gemma 2

All three models from @GoogleDeepMind are roughly the same size (Gemma-4-31B, Gemma-3-27B, Gemma-2-27B), and these gains came only 9 and 13… https://t.co/9JnbveYzwT pic.twitter.com/JQtTz09Y1A
— Arena.ai (@arena) April 2, 2026
Currently, developers can try Gemma 4 directly on Google AI Studio, or download the weights from platforms such as Hugging Face and Ollama. The community has responded quickly as well, releasing quantized versions optimized for GPU deployment. That said, some developers have noted that in real-world complex coding and debugging scenarios, Gemma 4 still has room for improvement. Overall, this open release has undoubtedly injected new momentum into digital sovereignty and local AI applications.