Google's open-weight Gemma 4 hits #3 on Arena AI under Apache 2.0, with major benchmark gains over its predecessor.
Google DeepMind's **31-billion-parameter Gemma 4** model now sits third in the world on the Arena AI leaderboard, scoring **1,452 Elo points** and outperforming models with 20 times more parameters. Google DeepMind announced the full Gemma 4 family on April 2, 2026, built on the architecture powering Gemini 3. The 26B Mixture-of-Experts variant placed sixth globally with 1,441 Elo, putting two Gemma 4 models among the top six AI systems on the planet.
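To put the 11-point gap between the two Gemma 4 models in perspective, here is a minimal sketch that converts a rating difference into an expected head-to-head win rate. It assumes the conventional Elo logistic formula with a 400-point scale; the leaderboard's exact rating method is not described here, so treat the result as a rough illustration. The function name `expected_win_rate` is hypothetical.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of wins for A over B under the standard Elo formula.

    Assumption: ratings follow the conventional base-10, 400-point Elo scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the article
gemma_31b_dense = 1452
gemma_26b_moe = 1441

p = expected_win_rate(gemma_31b_dense, gemma_26b_moe)
print(f"Expected win rate of 31B Dense over 26B MoE: {p:.3f}")
```

Under that assumption the 31B model would be preferred in roughly 52% of direct comparisons, so the two variants are close to parity in head-to-head terms.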
## Four Models, One Strategy
The lineup covers the full hardware spectrum. The E2B (2.3 billion effective parameters) and E4B (4.5 billion) run directly on Android smartphones via AI Edge Gallery and LiteRT-LM for fully offline inference. The 26B MoE and 31B Dense variants target workstations and consumer RTX GPUs, according to Google's Developers Blog.
**Clément Farabet**, VP of Research at Google DeepMind, described Gemma 4 as "our most intelligent open models to date, built specifically for advanced reasoning and agentic workflows." The benchmark jumps from Gemma 3 are striking: on AIME 2026 (competition-level mathematics), the score leapt from **20.8% to 89.2%**. LiveCodeBench (competitive programming) climbed from **29.1% to 80.0%**, and GPQA Diamond (graduate-level science) rose from **42.4% to 84.3%**.
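The size of those jumps is easier to compare side by side. The short sketch below uses only the scores quoted above and computes the gain in percentage points and the ratio of new score to old score for each benchmark. The helper name `gains` is illustrative.

```python
# (Gemma 3 score, Gemma 4 score) in percent, as quoted in the article
scores = {
    "AIME 2026": (20.8, 89.2),
    "LiveCodeBench": (29.1, 80.0),
    "GPQA Diamond": (42.4, 84.3),
}


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point gain, ratio of new score to old score)."""
    return round(after - before, 1), round(after / before, 2)


for name, (before, after) in scores.items():
    points, ratio = gains(before, after)
    print(f"{name}: +{points} points ({ratio}x the Gemma 3 score)")
```

By this arithmetic AIME 2026 shows the largest change, at 68.4 points, while the GPQA Diamond score roughly doubles.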
## The Licensing Shift That Changes Everything
The benchmark numbers matter, but the licensing change may matter more. For the first time, Google is releasing Gemma under the Apache 2.0 license, dropping the restrictive proprietary terms that had blocked previous versions from many commercial applications. Developers can now use, modify, and redistribute the models commercially, subject only to Apache 2.0's standard requirements such as preserving the license text and attribution notices.
That puts Gemma 4 in direct competition with Meta's Llama and Mistral for open-source dominance, as VentureBeat reported. The models are already available on Hugging Face, Kaggle, and Ollama. The practical implication is straightforward: a startup building an AI product can now self-host a model with near-frontier reasoning instead of paying per-token API fees to get it.
## Multimodal at the Edge
Gemma 4 handles images, audio, handwritten OCR, document analysis, chart recognition, and speech comprehension. The larger models support context windows up to **256K tokens** and more than **140 languages**, according to Google's official release.
The on-device story is where the release matters most for the broader AI ecosystem. Phone-grade inference with frontier-level reasoning closes a gap that has kept autonomous AI agents tethered to cloud infrastructure. As Engadget noted, the combination of compact model sizes and competitive benchmark performance is a direct signal that capable AI agents on consumer hardware are no longer a future promise.
## Sources
- Gemma 4: Byte for byte, the most capable open models — Google Blog
- Gemma 4 Model Page — Google DeepMind
- Bring state-of-the-art agentic skills to the edge with Gemma 4 — Google Developers Blog
- Gemma 4: Expanding the Gemmaverse with Apache 2.0 — Google Open Source Blog
- Google releases Gemma 4 under Apache 2.0 — VentureBeat
- Google releases Gemma 4, a family of open models built off of Gemini 3 — Engadget
- Welcome Gemma 4: Frontier multimodal intelligence on device — Hugging Face
- Arena AI Leaderboard
