Efficient Rationale-based Retrieval: On-policy Distillation from Generative Rerankers based on JEPA

Teng Chen; Sheng Xu; Feixiang Guo; Xiaoyu Wang; Qingqing Gu; Hongyan Li; Luo Ji

arXiv:2604.23336 · cs.IR · May 14, 2026

TL;DR

This paper introduces Rabtriever, an efficient rationale-based retrieval method that distills knowledge from generative rerankers using JEPA, reducing computational costs while maintaining high accuracy across various tasks.

Contribution

It presents a novel on-policy distillation framework with JEPA for rationale-based retrieval, achieving linear complexity and broad applicability.

Findings

01

Rabtriever outperforms baseline retrievers on rationale-based tasks.

02

It maintains high accuracy with significantly reduced computational costs.

03

The approach generalizes well to traditional retrieval benchmarks.

Abstract

Unlike traditional fact-based retrieval, rationale-based retrieval typically requires cross-encoding query-document pairs with large language models, which incurs substantial computational costs. To address this limitation, we propose Rabtriever, which encodes queries and documents independently while providing cross query-document comprehension comparable to that of rerankers. We start by training an LLM-based generative reranker, which places the document before the query and prompts the LLM to produce the relevance score from log probabilities. We then employ it as the teacher in an on-policy distillation framework, with Rabtriever as the student that reconstructs the teacher's context-aware query embedding. To achieve this effect, Rabtriever is first initialized from the teacher, with parameters frozen. The Joint-Embedding Predictive Architecture (JEPA) paradigm is then…
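To make the scoring and cost ideas in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the reranker's score is read from the log probabilities of two label tokens (a positive and a negative one), which the abstract does not specify; the prompt template and the names `build_prompt`, `relevance_from_logprobs`, and `encoder_calls` are hypothetical.

```python
import math


def build_prompt(document: str, query: str) -> str:
    """Place the document before the query, as the abstract describes.

    The exact template wording is an assumption made for illustration.
    """
    return f"Document: {document}\nQuery: {query}\nRelevant (yes/no):"


def relevance_from_logprobs(logp_pos: float, logp_neg: float) -> float:
    """Turn two label log-probabilities into a score in (0, 1).

    This is a softmax over the two labels, written as a numerically
    stable sigmoid of their difference. Using exactly two label tokens
    is an assumption, not something the abstract states.
    """
    diff = logp_pos - logp_neg
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def encoder_calls(num_queries: int, num_docs: int) -> dict:
    """Count model forward passes under the two encoding schemes.

    Cross-encoding scores every (query, document) pair jointly, while
    independent encoding embeds each query and each document once.
    """
    return {
        "cross_encoding": num_queries * num_docs,
        "independent_encoding": num_queries + num_docs,
    }
```

Under these assumptions, `encoder_calls(10, 1000)` counts 10,000 joint passes for cross-encoding against 1,010 passes for independent encoding, which is the gap that motivates distilling the reranker into a retriever that encodes each side separately.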
