Gemma 4, particularly its edge-optimized E2B and E4B variants, can now run fully offline on iPhones through apps such as Locally AI and Google AI Edge Gallery. Users download a quantized model of roughly 1.5 GB, and all inference runs on-device via the Apple Neural Engine. The models are released under the Apache 2.0 license, which permits flexible deployment, and recent updates to both apps integrate Gemma 4 for efficient offline AI experimentation on mobile devices, supporting native multimodal inputs and on-device hardware acceleration.

Gemma 4: Gemma 4 is a family of open models from Google DeepMind, built on Gemini technology to deliver advanced reasoning, agentic workflows, and multimodal support for text, audio, and image inputs. It includes edge-optimized variants such as E2B and E4B, designed for deployment on mobile and embedded devices. In this news item, the E2B and E4B variants enable fully offline inference on iPhones using the Apple Neural Engine through apps like Locally AI and Google AI Edge Gallery.
Locally AI: Locally AI is an iOS app that allows users to run AI models such as Gemma directly on iPhones, iPads, and Macs with complete offline functionality and no data collection. It emphasizes privacy by processing everything on-device. The app is referenced in the news for supporting the offline execution of Gemma 4’s edge-optimized variants on iPhones.
Google AI Edge Gallery: Google AI Edge Gallery is an experimental app from Google available on iOS and Android for testing and building on-device generative AI experiences with models like Gemma 4. It enables local model downloads and fully offline multimodal interactions. The news highlights it as a platform for running Gemma 4’s E2B and E4B variants entirely on iPhones via the Apple Neural Engine.
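As an illustration of the kind of on-device integration these apps perform, the sketch below uses Google's MediaPipe LLM Inference API for iOS (the MediaPipeTasksGenAI framework) to load a locally stored Gemma model bundle and generate text with no network access. The model filename and prompt here are assumptions for illustration; neither app's internal code is public, so this is a minimal sketch of the general approach, not either app's actual implementation.

```swift
import Foundation
import MediaPipeTasksGenAI

// Hypothetical bundled model filename; the real name depends on the
// quantized Gemma variant downloaded by the app (e.g. an E2B build).
guard let modelPath = Bundle.main.path(forResource: "gemma-e2b", ofType: "task") else {
    fatalError("Model bundle not found on device")
}

// Configure the on-device LLM runtime with the local model file.
let options = LlmInference.Options(modelPath: modelPath)
let llm = try LlmInference(options: options)

// Inference runs entirely on-device; no network connection is required.
let response = try llm.generateResponse(
    inputText: "Explain on-device inference in one sentence."
)
print(response)
```

Because the model file lives on the device and the runtime executes locally, this pattern preserves the privacy and offline properties both apps advertise.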

```json
{
  "Open License": "Gemma 4 models are released under a commercially permissive Apache 2.0 license for flexible deployment.",
  "App Demonstrations": "Recent updates to Google AI Edge Gallery and Locally AI integrate Gemma 4 for seamless offline AI experimentation on mobile devices.",
  "Model Optimization": "Gemma 4's edge variants support native multimodal inputs and function calling tailored for on-device hardware acceleration."
}
```