Google DeepMind Introduces EmbeddingGemma for On-Device AI

September 05, 2025
Google DeepMind has unveiled EmbeddingGemma, a new open embedding model designed for efficient on-device AI applications. With 308 million parameters, it supports multilingual tasks and operates offline, making it ideal for mobile and desktop devices.

Google DeepMind has introduced EmbeddingGemma, an open embedding model built for on-device AI applications. At 308 million parameters, the model is optimized for efficiency and privacy: applications can run it directly on local hardware without an internet connection.

EmbeddingGemma is notable for its compact size: its performance is comparable to that of models nearly twice its size, and with quantization it can run in less than 200MB of RAM. It supports over 100 languages and offers customizable output dimensions, making it suitable for a variety of devices, including mobile phones, laptops, and desktops.
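To put the memory figure in context, the following back-of-envelope sketch estimates how much space 308 million weights would occupy at different precisions. It is a rough illustration only: the uniform bit widths are assumptions, the article does not say which quantization scheme is used, and the estimate ignores activations, the tokenizer, and runtime overhead.

```python
# Rough estimate of weight storage only. Assumes every weight uses the same
# number of bits, which is a simplification; real quantized models often mix
# precisions, and total memory use also includes activations and overhead.
PARAMS = 308_000_000  # parameter count reported in the article


def weight_megabytes(params: int, bits_per_weight: int) -> float:
    """Return approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return params * bits_per_weight / 8 / 1_000_000


for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_megabytes(PARAMS, bits):7.1f} MB")
```

Under these simplified assumptions, the weights alone would fit under 200MB only at an average of roughly 5 bits per weight or fewer, which is consistent with the article tying the figure to quantization.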

The model integrates with popular tools such as sentence-transformers, transformers.js, and Weaviate, which makes it easier for developers to adopt. EmbeddingGemma is aimed in particular at Retrieval Augmented Generation (RAG) and semantic search, where the quality of the embeddings determines how accurately an on-device application can find relevant content.
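The retrieval step behind semantic search and RAG can be sketched in a few lines. The example below is a minimal illustration, not EmbeddingGemma's own API: the three-dimensional vectors and document names are made-up stand-ins for embeddings that a real model would produce, and ranking by cosine similarity is one common choice among several.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Hypothetical toy embeddings; a real model would output far more dimensions.
docs = {
    "reset-password": [0.9, 0.1, 0.0],
    "billing-faq": [0.1, 0.9, 0.1],
    "install-guide": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedded user question

results = top_k(query, docs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG pipeline, the text of the top-ranked documents would then be passed to a generative model as context. Because both the embedding and the ranking happen locally, the user's query never has to leave the device.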

By using Matryoshka Representation Learning, EmbeddingGemma provides multiple embedding sizes from a single model, so developers can choose smaller vectors when speed and storage matter more than maximum quality. This makes it a versatile choice for developers looking to build privacy-centric, offline-enabled applications.
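With Matryoshka-style embeddings, the leading dimensions of a vector carry the most information, so a shorter embedding can be obtained by keeping only a prefix of the full one. The sketch below illustrates that idea under stated assumptions: the eight-dimensional vector is a made-up stand-in for real model output, the function name is hypothetical, and re-normalizing after truncation is a common practice rather than something the article specifies.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` values of an embedding and rescale to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and the embedding length")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in head]


# Hypothetical full-size embedding; a real one would be much longer.
full = [0.62, -0.41, 0.35, 0.28, -0.12, 0.09, -0.05, 0.02]

small = truncate_embedding(full, 4)
print(len(full), "->", len(small))
```

The shorter vector takes half the storage and makes similarity comparisons cheaper, usually at some cost in retrieval quality. Queries and documents need to be truncated to the same size before they are compared.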

