Google DeepMind Releases Gemini Embedding 2, a Multimodal Embedding Model

March 11, 2026
Google DeepMind has launched Gemini Embedding 2, its first natively multimodal embedding model that maps text, images, video, audio, and documents into a unified embedding space. The model is available in public preview through the Gemini API and Vertex AI.

Google's DeepMind team has introduced Gemini Embedding 2, its first fully multimodal embedding model built on the Gemini architecture. The model maps text, images, video, audio, and documents into a single embedding space, enabling unified semantic understanding across various media types. It is now available in public preview through the Gemini API and Vertex AI.
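
One practical consequence of a single embedding space is that a text query can be compared directly against vectors for images, video, or documents. The sketch below illustrates the idea with cosine similarity, a common choice that the announcement does not itself specify. The three-dimensional vectors are made-up stand-ins rather than model output, and the helper names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, items):
    """Return item names ordered from most to least similar to the query."""
    return sorted(
        items,
        key=lambda name: cosine_similarity(query, items[name]),
        reverse=True,
    )

# Toy vectors for illustration only (assumption: real embeddings would come
# from the model and have hundreds or thousands of dimensions).
query_vec = [1.0, 0.0, 0.0]           # stands in for an embedded text query
library = {
    "photo.png": [0.9, 0.1, 0.0],     # stands in for an embedded image
    "clip.mp4": [0.1, 0.9, 0.0],      # stands in for an embedded video
    "memo.pdf": [0.5, 0.5, 0.0],      # stands in for an embedded document
}

ranking = rank_by_similarity(query_vec, library)
```

Because every item lives in the same space, the ranking step does not need to know which modality each vector came from.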

Gemini Embedding 2 supports over 100 languages and allows developers to embed multiple modalities in a single request. It can process text inputs up to 8,192 tokens, handle up to six images per request in PNG or JPEG formats, and process up to 120 seconds of video in MP4 or MOV formats. The model also embeds audio data directly and supports PDFs up to six pages long.
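
The limits above lend themselves to a client-side check before a request is sent. The following is a minimal sketch that encodes only the figures quoted in this article; the `EmbedRequest` class and `check_limits` function are hypothetical and are not part of the Gemini API or Vertex AI SDKs. Audio limits are omitted because the article does not state them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits as quoted in the article.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES = 6
MAX_VIDEO_SECONDS = 120
MAX_PDF_PAGES = 6
IMAGE_FORMATS = {"png", "jpeg"}
VIDEO_FORMATS = {"mp4", "mov"}

@dataclass
class EmbedRequest:
    """Hypothetical description of one multimodal embedding request."""
    text_tokens: int = 0
    image_formats: List[str] = field(default_factory=list)
    video_seconds: float = 0.0
    video_format: Optional[str] = None
    pdf_pages: int = 0

def check_limits(req: EmbedRequest) -> List[str]:
    """Return a list of human-readable problems; empty means within limits."""
    problems = []
    if req.text_tokens > MAX_TEXT_TOKENS:
        problems.append(f"text has {req.text_tokens} tokens, limit is {MAX_TEXT_TOKENS}")
    if len(req.image_formats) > MAX_IMAGES:
        problems.append(f"{len(req.image_formats)} images, limit is {MAX_IMAGES}")
    bad_images = sorted({f.lower() for f in req.image_formats} - IMAGE_FORMATS)
    if bad_images:
        problems.append(f"unsupported image formats: {', '.join(bad_images)}")
    if req.video_seconds > MAX_VIDEO_SECONDS:
        problems.append(f"video is {req.video_seconds}s, limit is {MAX_VIDEO_SECONDS}s")
    if req.video_format is not None and req.video_format.lower() not in VIDEO_FORMATS:
        problems.append(f"unsupported video format: {req.video_format}")
    if req.pdf_pages > MAX_PDF_PAGES:
        problems.append(f"PDF has {req.pdf_pages} pages, limit is {MAX_PDF_PAGES}")
    return problems

ok_request = EmbedRequest(text_tokens=100, image_formats=["png"] * 6)
bad_request = EmbedRequest(
    text_tokens=9000,
    image_formats=["gif"],
    video_seconds=130,
    video_format="avi",
    pdf_pages=7,
)
```

Calling `check_limits(ok_request)` returns an empty list, while `check_limits(bad_request)` reports one problem for each limit that is exceeded.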

The model uses Matryoshka Representation Learning, which allows dynamic scaling of embedding dimensions from the default 3,072 down to smaller configurations for optimized performance and storage. DeepMind recommends using 3,072, 1,536, or 768 dimensions for best results.
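
With Matryoshka-style embeddings, a smaller vector is typically obtained by keeping the leading components of the full one. The sketch below shows that truncation done client-side, followed by L2 renormalization so that cosine or dot-product comparisons remain well scaled. Whether the API already returns normalized vectors at reduced sizes is not stated in the article, so the renormalization step is an assumption, and `truncate_embedding` is a hypothetical helper.

```python
import math

# Output sizes the article says DeepMind recommends.
RECOMMENDED_DIMS = (3072, 1536, 768)

def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale the result to unit length."""
    if dims <= 0 or dims > len(vector):
        raise ValueError(f"dims must be between 1 and {len(vector)}")
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale in an all-zero vector
    return [x / norm for x in head]

# Placeholder full-size vector (assumption: real values come from the model).
full = [0.5] * 3072
small = truncate_embedding(full, 768)
```

Storing the 768-dimension form takes a quarter of the space of the 3,072-dimension default, which is the storage trade-off the article refers to.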

According to the announcement, Gemini Embedding 2 achieves state-of-the-art performance across text, image, and video benchmarks, offering improved multimodal depth and speech understanding. Developers can access the model via the Gemini API or Vertex AI, and it integrates with tools such as LangChain, LlamaIndex, Haystack, Weaviate, Qdrant, and ChromaDB.
