$\text{R}^2\text{R}$: A Route-to-Rerank Post-Training Framework for Multi-Domain Decoder-Only Rerankers

Xinyu Wang; Hanwei Wu; Qingchen Hu; Zhenghan Tai; Jingrui Tian; Lei Ding; Jijun Chi; Hailin He; Tung Sum Thomas Kwok; Yufei Cui; Sicheng Lyu; Muzhi Li; Mingze Li; Xinyue Yu; Ling Zhou; Peng Lu

arXiv:2511.19987 · cs.CL · November 26, 2025

Open Access

TL;DR

R2R is a domain-aware reranking framework that combines dynamic expert routing with entity abstraction to improve relevance estimation for multi-domain retrieval in high-stakes fields such as law, medicine, and finance.

Contribution

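The paper introduces R2R, a framework that integrates dynamic expert routing with Entity Abstraction for Generalization (EAG), a two-stage training strategy, to improve domain-specific reranking while mitigating surface-form overfitting and catastrophic forgetting.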

Findings

01. R2R outperforms generalist and single-domain models across legal, medical, and financial domains.

02. EAG effectively prevents overfitting by masking surface cues, promoting domain-invariant learning.

03. The Latent Semantic Router efficiently activates domain experts, enhancing model robustness.
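
The masking idea in finding 02 can be illustrated with a minimal sketch. It assumes the entity surface forms and their types are already known and supplied as a dictionary; how the paper detects entities, or decides which cues are most predictive, is not stated on this page. The function name `abstract_entities` and the placeholder format `[TYPE]` are hypothetical.

```python
import re
from typing import Dict


def abstract_entities(text: str, entities: Dict[str, str]) -> str:
    """Replace known entity surface forms with typed placeholders.

    `entities` maps a surface form (e.g. "Acme Corp") to a type label
    (e.g. "ORG"). This mapping is an assumption for illustration only.
    """
    if not entities:
        return text
    # Case-insensitive lookup from matched text back to its type label.
    lookup = {surface.lower(): label for surface, label in entities.items()}
    # Longest surface forms first, so "Acme Corp" wins over "Acme".
    ordered = sorted(entities, key=len, reverse=True)
    alternation = "|".join(re.escape(surface) for surface in ordered)
    # Lookarounds keep matches from starting or ending inside a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)
    return pattern.sub(lambda m: "[" + lookup[m.group(0).lower()] + "]", text)


masked = abstract_entities(
    "Acme Corp sued Beta LLC in 2021.",
    {"Acme Corp": "ORG", "Beta LLC": "ORG", "Acme": "ORG"},
)
print(masked)
```

Training on text abstracted this way removes the option of memorizing specific entity names, which is the shortcut the finding says EAG is meant to block.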

Abstract

Decoder-only rerankers are central to Retrieval-Augmented Generation (RAG). However, generalist models miss domain-specific nuances in high-stakes fields like finance and law, and naive fine-tuning causes surface-form overfitting and catastrophic forgetting. To address this challenge, we introduce R2R, a domain-aware framework that combines dynamic expert routing with a two-stage training strategy, Entity Abstraction for Generalization (EAG). EAG introduces a counter-shortcut mechanism by masking the most predictive surface cues, forcing the reranker to learn domain-invariant relevance patterns rather than memorizing dataset-specific entities. To efficiently activate domain experts, R2R employs a lightweight Latent Semantic Router that probes internal representations from the frozen backbone decoder to select the optimal LoRA expert per query. Extensive experiments across different…
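
The routing step described in the abstract can be sketched as a small probe over pooled hidden states. This is an illustration under stated assumptions, not the authors' implementation: the router is modeled here as a linear scorer over mean-pooled token representations, and the expert names, weights, and the functions `mean_pool` and `route` are all hypothetical. The abstract only says that the router probes internal representations of the frozen backbone and selects one LoRA expert per query.

```python
from typing import Dict, List, Sequence


def mean_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token hidden vectors into one query vector."""
    n_tokens = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(tok[d] for tok in hidden_states) / n_tokens for d in range(dim)]


def route(
    hidden_states: Sequence[Sequence[float]],
    weights: Dict[str, Sequence[float]],
    biases: Dict[str, float],
) -> str:
    """Pick the expert whose linear score on the pooled vector is highest.

    `hidden_states` stands in for activations read from a frozen backbone;
    `weights` and `biases` are the only trainable parts in this sketch.
    """
    pooled = mean_pool(hidden_states)
    scores = {
        name: sum(w * x for w, x in zip(weights[name], pooled)) + biases.get(name, 0.0)
        for name in weights
    }
    return max(scores, key=scores.get)


# Toy example: two tokens with 2-dimensional hidden states.
hidden = [[1.0, 0.0], [3.0, 0.0]]
weights = {"legal": [1.0, 0.0], "medical": [0.0, 1.0], "finance": [-1.0, 0.0]}
biases = {"legal": 0.0, "medical": 0.0, "finance": 0.0}
print(route(hidden, weights, biases))
```

Because the backbone stays frozen in this sketch, only the per-expert weight vectors would be trained, which is one way a router can remain lightweight relative to the reranker itself.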

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning