Google DeepMind Releases Gemini Embedding 2, a Multimodal Embedding Model

March 11, 2026

Google DeepMind has launched Gemini Embedding 2, its first natively multimodal embedding model that maps text, images, video, audio, and documents into a unified embedding space. The model is available in public preview through the Gemini API and Vertex AI.

Gemini Embedding 2 is built on the Gemini architecture. Because text, images, video, audio, and documents all land in a single embedding space, content in one medium can be compared semantically with content in another.

Gemini Embedding 2 supports over 100 languages and allows developers to embed multiple modalities in a single request. It can process text inputs up to 8,192 tokens, handle up to six images per request in PNG or JPEG formats, and process up to 120 seconds of video in MP4 or MOV formats. The model also embeds audio data directly and supports PDFs up to six pages long.
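The input limits above lend themselves to a client-side check before a request is sent. The following is a minimal sketch, not part of any Google SDK: the `EmbedRequest` type, its field names, and the `check_limits` helper are hypothetical, and only the numeric limits and format names come from the article. Treating "jpg" as JPEG is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits as reported in the article.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES = 6
IMAGE_FORMATS = {"png", "jpeg", "jpg"}  # "jpg" treated as JPEG (assumption)
MAX_VIDEO_SECONDS = 120
VIDEO_FORMATS = {"mp4", "mov"}
MAX_PDF_PAGES = 6


@dataclass
class EmbedRequest:
    """Hypothetical description of one multimodal request (not an SDK type)."""
    text_tokens: int = 0
    image_formats: List[str] = field(default_factory=list)
    video_seconds: float = 0.0
    video_format: Optional[str] = None
    pdf_pages: int = 0


def check_limits(req: EmbedRequest) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if req.text_tokens > MAX_TEXT_TOKENS:
        problems.append(
            f"text has {req.text_tokens} tokens, limit is {MAX_TEXT_TOKENS}")
    if len(req.image_formats) > MAX_IMAGES:
        problems.append(
            f"{len(req.image_formats)} images, limit is {MAX_IMAGES}")
    for fmt in req.image_formats:
        if fmt.lower() not in IMAGE_FORMATS:
            problems.append(f"unsupported image format: {fmt}")
    if req.video_format is not None:
        if req.video_format.lower() not in VIDEO_FORMATS:
            problems.append(f"unsupported video format: {req.video_format}")
        if req.video_seconds > MAX_VIDEO_SECONDS:
            problems.append(
                f"video is {req.video_seconds}s, limit is {MAX_VIDEO_SECONDS}s")
    if req.pdf_pages > MAX_PDF_PAGES:
        problems.append(
            f"PDF has {req.pdf_pages} pages, limit is {MAX_PDF_PAGES}")
    return problems


# Example: a request that stays within every reported limit.
ok_request = EmbedRequest(text_tokens=512, image_formats=["png", "JPEG"])
print(check_limits(ok_request))
```

Returning a list of violations rather than raising on the first one lets a caller report every problem with a request at once.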

The model uses Matryoshka Representation Learning, which allows dynamic scaling of embedding dimensions from the default 3,072 down to smaller configurations for optimized performance and storage. DeepMind recommends using 3,072, 1,536, or 768 dimensions for best results.
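With Matryoshka-style embeddings, the leading components of a vector are trained to be usable as a smaller embedding on their own. A rough sketch of what shrinking a vector looks like on the client side follows. It assumes the caller already holds a full-size vector as a list of floats; the helper names are hypothetical, and the article does not say whether the API truncates or normalizes on the caller's behalf. Re-normalizing after truncation is a common practice when vectors are compared by cosine similarity or dot product.

```python
import math
from typing import List

# Sizes the article says DeepMind recommends.
RECOMMENDED_DIMS = (3072, 1536, 768)


def truncate_embedding(vec: List[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit L2 length."""
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


# Synthetic stand-in for a 3,072-dimension embedding (not real model output).
full = [float(i % 7 + 1) for i in range(3072)]
small = truncate_embedding(full, 768)
print(len(small))
```

Vectors being compared must be truncated to the same size; a 768-dimension query cannot be scored against 3,072-dimension stored vectors.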

According to the announcement, Gemini Embedding 2 achieves state-of-the-art performance across text, image, and video benchmarks, and offers improved multimodal depth and speech understanding. Developers can access the model via the Gemini API or Vertex AI, and it integrates with tools such as LangChain, LlamaIndex, Haystack, Weaviate, Qdrant, and ChromaDB.
