jina-embeddings-v5-text: Task-Targeted Embedding Distillation

Mohammad Kalim Akram; Saba Sturua; Nastia Havriushenko; Quentin Herreros; Michael Günther; Maximilian Werk; Han Xiao

arXiv:2602.15547 · cs.CL · April 29, 2026

TL;DR

This paper presents a novel training approach combining distillation and contrastive loss to create compact, high-performance text embedding models that support long texts and multiple languages.

Contribution

The paper introduces a training regimen that the authors find more effective for small models than purely contrastive or purely distillation-based training, and it releases the model weights publicly to foster further research.

Findings

01. Benchmark scores match or exceed state-of-the-art for similar-sized models.

02. Models support long texts up to 32k tokens in many languages.

03. Embeddings remain robust under truncation and quantization.
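
To make the third finding concrete, here is a minimal sketch of what "truncation" and "quantization" of an embedding can look like. The page does not say which schemes the authors evaluate, so two common ones are assumed: keeping a prefix of the vector and renormalizing, and one-bit sign quantization. The helper names (`truncate`, `binarize`, `hamming`) and the 1024-dimensional random vectors are hypothetical stand-ins, not the models' actual outputs or API.

```python
import numpy as np

def truncate(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row and L2-renormalize."""
    cut = emb[:, :dim]
    return cut / np.linalg.norm(cut, axis=1, keepdims=True)

def binarize(emb: np.ndarray) -> np.ndarray:
    """One-bit quantization: bit is 1 where the component is positive, packed into bytes."""
    return np.packbits(emb > 0, axis=1)

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed bit vectors."""
    return int(np.unpackbits(a ^ b).sum())

# Random unit vectors stand in for real embeddings (assumption: 1024 dimensions).
rng = np.random.default_rng(0)
full = rng.normal(size=(4, 1024)).astype(np.float32)
full /= np.linalg.norm(full, axis=1, keepdims=True)

small = truncate(full, 256)   # shape (4, 256), rows renormalized to unit length
bits = binarize(full)         # shape (4, 128), dtype uint8 (1024 bits per row)
distance = hamming(bits[0], bits[1])
```

A robustness check in this spirit would compare rankings produced by `full` against rankings produced by `small` or by Hamming distance over `bits`; with random vectors the comparison is only illustrative.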

Abstract

Text embedding models are widely used for semantic similarity tasks, including information retrieval, clustering, and classification. General-purpose models are typically trained with single- or multi-stage processes using contrastive loss functions. We introduce a novel training regimen that combines model distillation techniques with task-specific contrastive loss to produce compact, high-performance embedding models. Our findings suggest that this approach is more effective for training small models than purely contrastive or distillation-based training paradigms alone. Benchmark scores for the resulting models, jina-embeddings-v5-text-small and jina-embeddings-v5-text-nano, exceed or match the state-of-the-art for models of similar size. jina-embeddings-v5-text models additionally support long texts (up to 32k tokens) in many languages, and generate embeddings that remain robust…
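
The abstract describes combining distillation with a task-specific contrastive loss but, as excerpted here, does not give the formulas. The following is a rough sketch of one way such a combination could be written, under stated assumptions: an in-batch InfoNCE contrastive term, a distillation term that matches the student's pairwise cosine similarities to a teacher's, and a hypothetical mixing weight `alpha`. None of these choices, names, or default values are taken from the paper.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def info_nce(queries: np.ndarray, docs: np.ndarray, temperature: float = 0.05) -> float:
    """In-batch contrastive loss: docs[i] is the positive for queries[i]."""
    logits = l2_normalize(queries) @ l2_normalize(docs).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def similarity_distillation(student: np.ndarray, teacher: np.ndarray) -> float:
    """MSE between student and teacher pairwise cosine-similarity matrices.

    Comparing similarity matrices lets the two models use different dimensions.
    """
    s = l2_normalize(student)
    t = l2_normalize(teacher)
    return float(np.mean((s @ s.T - t @ t.T) ** 2))

def combined_loss(s_q, s_d, t_q, t_d, alpha: float = 0.5) -> float:
    """Weighted sum of the contrastive and distillation terms (alpha is hypothetical)."""
    contrastive = info_nce(s_q, s_d)
    distill = 0.5 * (similarity_distillation(s_q, t_q) + similarity_distillation(s_d, t_d))
    return alpha * contrastive + (1.0 - alpha) * distill

# Random arrays stand in for a batch of 8 query/document embeddings.
rng = np.random.default_rng(0)
s_q, s_d = rng.normal(size=(8, 64)), rng.normal(size=(8, 64))      # student outputs
t_q, t_d = rng.normal(size=(8, 256)), rng.normal(size=(8, 256))    # teacher outputs
loss = combined_loss(s_q, s_d, t_q, t_d)
```

In a real training loop the same terms would be computed in an autodiff framework so that gradients flow to the student only; NumPy is used here just to keep the arithmetic visible.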
