Gemini Embedding 2: Google's First Natively Multimodal Embedding Model Released

Google DeepMind has released Gemini Embedding 2 in public preview, their first fully multimodal embedding model built on the Gemini architecture. Unlike previous text-only models, this one maps text, images, videos, audio, and documents into a single, unified embedding space, capturing semantic intent across over 100 languages.
Key Technical Details
The model is available through the Gemini API and Vertex AI, and supports these specific capabilities:
- Text: Supports context of up to 8192 input tokens
- Images: Processes up to 6 images per request (PNG and JPEG formats)
- Videos: Supports up to 120 seconds of video input (MP4 and MOV formats)
- Audio: Natively ingests and embeds audio without needing text transcriptions
- Documents: Directly embeds PDFs up to 6 pages long
Beyond processing single modalities, the model natively understands interleaved input, allowing you to pass multiple modalities (e.g., image + text) in a single request to capture nuanced relationships between different media types.
Flexible Output Dimensions
Gemini Embedding 2 incorporates Matryoshka Representation Learning (MRL), enabling flexible output dimensions scaling down from the default 3072. This lets developers balance performance and storage costs. Google recommends using 3072, 1536, or 768 dimensions for highest quality.
Integration and Use Cases
The model is designed for multimodal downstream tasks including Retrieval-Augmented Generation (RAG), semantic search, sentiment analysis, and data clustering. It's available through multiple platforms:
- Gemini API
- Vertex AI
- LangChain, LlamaIndex, Haystack
- Vector databases: Weaviate, QDrant, ChromaDB, and Vector Search
Google provides interactive Colab notebooks for getting started with the Gemini API and Vertex AI implementations.
📖 Read the full source: HN AI Agents
👀 See Also

US Military Pressures Anthropic to Remove Claude Safeguards for Military Use
US military leaders including Defense Secretary Pete Hegseth met with Anthropic executives to demand removal of Claude's safeguards against military applications like mass surveillance and autonomous weapons. The Pentagon has given Anthropic until Friday to comply or face penalties including contract cancellation.

Ubuntu Linux to Integrate AI Features Over the Next Year, Starting with Local Inferencing
Canonical announces a multi-year AI push for Ubuntu, focusing on local inferencing, agentic workflows, and context-aware OS capabilities, with features rolling out throughout 2026.

Anthropic Launches Remote Control for Claude Code
Anthropic has launched remote control functionality for Claude Code, allowing users to continue coding sessions from mobile devices. The feature is documented at code.claude.com/docs/en/remote-control.

Atlassian Announces 1,600 Layoffs as Part of AI Pivot
Atlassian plans to cut approximately 1,600 jobs as the company shifts its focus toward AI development, according to a Reuters report shared on Hacker News.