Gemini Embedding: Generalizable Embeddings from Gemini
Jinhyuk Lee, Feiyang Chen, Sahil Dua, Daniel Cer, Madhuri Shanbhogue, Iftekhar Naim, Gustavo Hernández Ábrego, Zhe Li, Kaifeng Chen, Henrique Schechter Vera, Xiaoqi Ren, Shanfeng Zhang, Daniel Salz, Michael Boratko, Jay Han, Blair Chen, Shuo Huang, Vikram Rao

TL;DR
Gemini Embedding is a new multilingual embedding model leveraging Google's Gemini LLM, producing highly generalizable text representations that outperform previous models across diverse languages and tasks.
Contribution
This paper introduces Gemini Embedding, a state-of-the-art multilingual embedding model based on Google's Gemini LLM, and shows that it outperforms prior models on the Massive Multilingual Text Embedding Benchmark (MMTEB).
Findings
Outperforms prior state-of-the-art models on MMTEB
Achieves state-of-the-art results in multilingual and code tasks
Demonstrates broad applicability across various downstream tasks
Abstract
In this report, we introduce Gemini Embedding, a state-of-the-art embedding model leveraging the power of Gemini, Google's most capable large language model. Capitalizing on Gemini's inherent multilingual and code understanding capabilities, Gemini Embedding produces highly generalizable embeddings for text spanning numerous languages and textual modalities. The representations generated by Gemini Embedding can be precomputed and applied to a variety of downstream tasks including classification, similarity, clustering, ranking, and retrieval. Evaluated on the Massive Multilingual Text Embedding Benchmark (MMTEB), which includes over one hundred tasks across 250+ languages, Gemini Embedding substantially outperforms prior state-of-the-art models, demonstrating considerable improvements in embedding quality. Achieving state-of-the-art performance across MMTEB's multilingual, English, and…
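The abstract notes that embeddings can be precomputed and then reused for tasks such as similarity, ranking, and retrieval. The following is a minimal sketch of that pattern using cosine similarity over stored vectors. It is not the paper's method or any Gemini API: the function names `cosine_similarity` and `rank_documents`, the document ids, and the toy 3-dimensional vectors are all hypothetical and chosen only for illustration. In practice the vectors would come from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted by descending similarity to the query."""
    scores = {
        doc_id: cosine_similarity(query_vec, vec)
        for doc_id, vec in doc_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical precomputed document embeddings (toy values, not model output).
precomputed_docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}

# Hypothetical query embedding, computed at request time.
query = [0.9, 0.1, 0.0]

ranking = rank_documents(query, precomputed_docs)
print(ranking)
```

Because the document vectors are stored ahead of time, only the query has to be embedded per request. The same stored vectors could also feed a classifier or a clustering routine.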
Taxonomy
Topics: Language and cultural evolution
