Google Releases Gemini Embedding 2 for Multimodal Data Processing

March 10, 2026

Google on Tuesday introduced Gemini Embedding 2, a new artificial intelligence model that maps text, images, video, audio and documents into a shared embedding space, expanding the company’s Gemini family of models with a system designed for multimodal data processing.

The model, now available in public preview through the Gemini API and Vertex AI, allows developers to generate embeddings from multiple types of input within a single framework. Google said the system captures semantic relationships across more than 100 languages, helping power tasks such as semantic search, Retrieval-Augmented Generation (RAG), sentiment analysis and data clustering.

Unlike earlier embedding systems focused primarily on text, Gemini Embedding 2 supports several types of media. The model can process up to 8,192 input tokens of text, analyze up to six images per request in PNG or JPEG formats, and handle video clips up to 120 seconds long in MP4 or MOV formats. It can also embed audio without requiring transcription and directly process PDF documents up to six pages in length.

Google said the model can also interpret interleaved inputs, allowing developers to combine modalities in a single request, such as submitting both an image and descriptive text together. The goal is to capture relationships between different types of information within a unified representation.

“Gemini Embedding 2 maps text, images, videos, audio and documents into a single, unified embedding space, and captures semantic intent across over 100 languages,” Google said in a blog post announcing the release.

The company added that the model builds on Gemini’s broader multimodal capabilities. “Gemini Embedding 2 doesn’t just improve on legacy models,” Google said. “It establishes a new performance standard for multimodal depth, introducing strong speech capabilities and outperforming leading models in text, image, and video tasks.”

Gemini Embedding 2 includes a technique called Matryoshka Representation Learning, which allows developers to adjust the size of the output embedding. The default output dimension is 3,072, but it can be scaled down to 1,536 or 768 dimensions depending on performance and storage requirements.

Google said embeddings produced by the system are designed to support large-scale information retrieval and organization. The company noted that some early partners are already applying the model to analyze complex datasets containing multiple types of media.

Developers can access the model through Google’s AI platforms or integrate it with tools such as LangChain, LlamaIndex, Haystack, Weaviate, Qdrant, ChromaDB and Vector Search.

This analysis is based on reporting from Google.

Images courtesy of Google.

This article was generated with AI assistance and reviewed for accuracy and quality.

Last updated: March 10, 2026

Report Error

About this article: This article was generated with AI assistance and reviewed by our editorial team to ensure it follows our editorial standards for accuracy and independence. We maintain strict fact-checking protocols and cite all sources.

Word count: 420Reading time: 0 minutes

Explore More AI Resources

Continue with high-value guides related to this topic.

Compare AI Models

See ChatGPT, Claude, and Gemini side-by-side in one place.

Best AI Newsletters

Find top AI newsletters and subscribe to ChatAI Weekly.

AI FAQ

Quick answers about ChatAI, AI tools, and multi-model chat.

AI Tools

Use free AI tools for summarization, translation, and more.

AI Tools for this Article

📧 Stay Updated

Get the latest AI news delivered to your inbox every morning.

Continue Reading

Meta Bets on AWS Graviton to Power Next-Gen AI Infrastructure

Meta has agreed to deploy AWS Graviton processors at large scale, starting with tens of millions of cores, as the company expands the infrastructure behind its next generation of AI systems. The deal...

April 24, 2026•5 min read

Claude Managed Agents Now Remember: Anthropic Expands Enterprise AI Capabilities

Anthropic has introduced memory for Claude Managed Agents in public beta, adding a persistent storage layer that allows AI agents to retain information across sessions and improve performance over...

April 24, 2026•5 min read

Google’s New AI Chips Signal Major Shift in the Battle Against Nvidia

Google is redesigning its artificial intelligence chips to separate training and inference workloads, marking a shift in how the company builds hardware as it competes more directly with Nvidia. The...

April 22, 2026•5 min read

Explore All Articles

Google Releases Gemini Embedding 2 for Multimodal Data Processing

Explore More AI Resources

Compare AI Models

Best AI Newsletters

AI FAQ

AI Tools

AI Tools for this Article

Settings

📧 Stay Updated

Related Articles

Meta Bets on AWS Graviton to Power Next-Gen AI Infrastructure

Claude Managed Agents Now Remember: Anthropic Expands Enterprise AI Capabilities

Google’s New AI Chips Signal Major Shift in the Battle Against Nvidia

Continue Reading

Meta Bets on AWS Graviton to Power Next-Gen AI Infrastructure

Claude Managed Agents Now Remember: Anthropic Expands Enterprise AI Capabilities

Google’s New AI Chips Signal Major Shift in the Battle Against Nvidia

AI News Daily

Stay Ahead of AI

Go Premium

Follow Our Community

ChatAI

AI News Daily

Go Premium

ChatAI

Follow Our Community