Introducing Gemma 4 12B: a unified, encoder-free multimodal model
Jun 03, 2026
Gemma 4 12B is designed to bring high-performance multimodal intelligence directly to your laptop, combining mobile-first efficiency with advanced reasoning.
Olivier Lacombe
Director of Product Management, Google DeepMind
Gus Martins
Product Manager, Google DeepMind
Today, we are introducing Gemma 4 12B, our latest model designed to bring agentic multimodal intelligence directly to laptops. Bridging the gap between our edge-friendly E4B and our more advanced 26B Mixture of Experts (MoE), Gemma 4 12B packages powerful capabilities inside a reduced memory footprint. It is also our first mid-sized model to feature native audio inputs.
Thanks to the developer community, Gemma 4 models have now crossed 150 million downloads. You’ve built everything from wearable robotic arms for physical assistance to enterprise-grade AI security. We're excited to see what you build with this latest addition.
Here’s an overview of what makes Gemma 4 12B unique:
- ...
Copyright of this story solely belongs to blog.google.