jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval
Michael Günther, Saba Sturua, Mohammad Kalim Akram, Isabelle Mohr, Andrei Ungureanu, Bo Wang, Sedigheh Eslami, Scott Martens, Maximilian Werk, Nan Wang, Han Xiao

TL;DR
Jina-embeddings-v4 is a 3.8 billion parameter multimodal embedding model that unifies text and image representations, supports several retrieval tasks, and is particularly strong on visually rich content. The paper also introduces Jina-VDR, a new benchmark for evaluating retrieval over such content.
Contribution
The paper introduces jina-embeddings-v4, a 3.8 billion parameter multimodal model with a novel architecture and task-specific adapters, achieving state-of-the-art results in multimodal retrieval.
Findings
Achieves state-of-the-art performance on both single-modal and cross-modal retrieval tasks.
Excels in processing visually rich content like tables and diagrams.
Introduces Jina-VDR, a benchmark for visually rich image retrieval.
Abstract
We introduce jina-embeddings-v4, a 3.8 billion parameter multimodal embedding model that unifies text and image representations through a novel architecture supporting both single-vector and multi-vector embeddings in the late interaction style. The model incorporates task-specific Low-Rank Adaptation (LoRA) adapters to optimize performance across diverse retrieval scenarios, including query-document retrieval, semantic text similarity, and code search. Comprehensive evaluations demonstrate that jina-embeddings-v4 achieves state-of-the-art performance on both single-modal and cross-modal retrieval tasks, with particular strength in processing visually rich content such as tables, charts, diagrams, and mixed-media formats. To facilitate evaluation of this capability, we also introduce Jina-VDR, a novel benchmark specifically designed for visually rich image retrieval.
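The difference between the single-vector and multi-vector (late interaction) outputs mentioned in the abstract can be illustrated with a small scoring sketch. This is not the model's API: the function names and the toy vectors below are hypothetical, and the scoring follows the commonly used MaxSim formulation of late interaction, which is assumed here rather than taken from the paper.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_vector_score(query: Vector, doc: Vector) -> float:
    """Single-vector retrieval: one embedding per query and per document."""
    return cosine(query, doc)


def late_interaction_score(
    query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]
) -> float:
    """Multi-vector retrieval (MaxSim, assumed formulation).

    Each query token vector is matched to its most similar document
    token vector, and the per-token maxima are summed.
    """
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)


# Toy 2-dimensional vectors, invented purely for illustration.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[1.0, 0.0], [1.0, 1.0]]

print(single_vector_score([1.0, 1.0], [2.0, 1.0]))
print(late_interaction_score(query_tokens, doc_tokens))
```

Single-vector scoring needs one comparison per document, so it is cheap to index and search. Late interaction stores one vector per token and costs more, but it lets individual query tokens match specific parts of a document.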
Code & Models
- 🤗 jinaai/jina-embeddings-v4-mlx-8bit (model) · 292 downloads · ♡ 2
- 🤗 jinaai/jina-embeddings-v4 (model) · 336k downloads · ♡ 493
- 🤗 jinaai/jina-embeddings-v4-vllm-retrieval (model) · 33k downloads · ♡ 32
- 🤗 jinaai/jina-embeddings-v4-vllm-text-matching (model) · 367 downloads · ♡ 7
- 🤗 jinaai/jina-embeddings-v4-vllm-code (model) · 303 downloads · ♡ 3
- 🤗 ashrielbrian/jina-embeddings-v4 (model) · 12 downloads
- 🤗 KrishnaIndukuri/IRMSEmbeddingsV4 (model) · 16 downloads
- 🤗 remodlai/nova-embeddings-v1 (model) · 13 downloads
