Google Releases Gemma 4 as Competition in Open Models Heats Up
Senior AI Reporter
April 2, 2026
Google has released Gemma 4, a new generation of its open-weight AI models, introducing four variants designed for local and on-device use while shifting to a more permissive Apache 2.0 license.
The update gives developers access to models ranging from mobile-scale to larger systems capable of running on high-end GPUs. The lineup includes Effective 2B and 4B models aimed at smartphones and edge devices, alongside 26B Mixture of Experts and 31B Dense versions built for more demanding local workloads. Google said the models are optimized to run efficiently on hardware from mobile devices to single 80GB Nvidia H100 GPUs, with quantized versions available for consumer-grade systems.
Gemma 4 builds on the same underlying technology as Google’s Gemini models, with improvements in reasoning, math performance, and instruction following. The models also support agent-style workflows, including function calling, structured outputs, and integrations with external tools and APIs.
Google is positioning the release around performance relative to size. The company said the 31B model ranks third among open models on the Arena leaderboard, while the 26B variant ranks sixth, despite being significantly smaller than competing systems.
The smaller E2B and E4B models are designed for low-latency, on-device use. Google said it worked with hardware partners including Qualcomm and MediaTek to optimize these versions for smartphones and edge devices, enabling offline processing with minimal memory and power usage.
Gemma 4 also expands multimodal capabilities. The models can process images and video inputs, handle tasks like OCR and chart analysis, and support speech recognition on the smaller variants. Context windows have increased to 128,000 tokens for edge models and up to 256,000 tokens for larger versions, with support for more than 140 languages.
A key change is the move to an Apache 2.0 license, replacing earlier restrictions that developers had criticized as overly limiting. The new license removes many usage constraints and allows broader commercial use, giving developers more control over deployment and data.
Google said the models are available through tools including AI Studio and AI Edge Gallery, with downloadable weights on platforms like Hugging Face, Kaggle, and Ollama.
See more here:
This analysis is based on reporting from Ars Technica.
Image courtesy of Google.
This article was generated with AI assistance and reviewed for accuracy and quality.
About this article: This article was generated with AI assistance and reviewed by our editorial team to ensure it follows our editorial standards for accuracy and independence. We maintain strict fact-checking protocols and cite all sources.
Word count: 374Reading time: 0 minutes
AI Tools for this Article
Settings
Neutral, broadcast-style delivery (default)
Standard speaking pace
📧 Stay Updated
Get the latest AI news delivered to your inbox every morning.
Stay Ahead of AI
Get the latest AI breakthroughs, tools, and insights delivered to your inbox every week.