Rethinking the Reranker: Boundary-Aware Evidence Selection for Robust Retrieval-Augmented Generation

Jiashuo Sun; Pengcheng Jiang; Saizhuo Wang; Jiajun Fan; Heng Wang; Siru Ouyang; Ming Zhong; Yizhu Jiao; Chengsong Huang; Xueqiang Xu; Pengrui Han; Peiran Li; Jiaxin Huang; Ge Liu; Heng Ji; Jiawei Han

arXiv:2602.03689 · cs.CL · February 4, 2026

TL;DR

This paper introduces BAR-RAG, a boundary-aware evidence selector for retrieval-augmented generation that improves robustness and performance under noisy retrieval conditions by focusing on evidence that is optimally challenging for the generator.

Contribution

BAR-RAG redefines the reranker as a boundary-aware evidence selector, trained with reinforcement learning on generator feedback to pick evidence that is challenging yet sufficient for the generator, which improves robustness and accuracy.

Findings

01. Achieves 10.3% average performance gain over baselines.

02. Significantly improves robustness under retrieval noise.

03. Effectively targets evidence that balances difficulty and sufficiency.

Abstract

Retrieval-Augmented Generation (RAG) systems remain brittle under realistic retrieval noise, even when the required evidence appears in the top-K results. A key reason is that retrievers and rerankers optimize solely for relevance, often selecting either trivial, answer-revealing passages or evidence that lacks the critical information required to answer the question, without considering whether the evidence is suitable for the generator. We propose BAR-RAG, which reframes the reranker as a boundary-aware evidence selector that targets the generator's Goldilocks Zone -- evidence that is neither trivially easy nor fundamentally unanswerable for the generator, but is challenging yet sufficient for inference and thus provides the strongest learning signal. BAR-RAG trains the selector with reinforcement learning using generator feedback, and adopts a two-stage pipeline that fine-tunes the…
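
The visible part of the abstract does not state how the "Goldilocks Zone" is scored, so the following is only an illustrative sketch of one way a boundary-aware reward could be shaped, not the reward BAR-RAG actually uses. It assumes the generator is sampled several times on a candidate evidence set and that each sample is marked correct or incorrect; the function name `boundary_reward`, the `target` parameter, and the linear shape are all hypothetical choices made for this example.

```python
from typing import Sequence


def boundary_reward(correct: Sequence[bool], target: float = 0.5) -> float:
    """Hypothetical boundary-aware reward (an assumption, not the paper's formula).

    `correct` holds one flag per generator rollout on the same evidence set.
    The reward is 1.0 when the empirical success rate equals `target` and
    falls linearly to 0.0 at the success rate farthest from `target`.
    """
    if not correct:
        raise ValueError("need at least one generator rollout")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")

    # Empirical success rate of the generator on this evidence set.
    success_rate = sum(correct) / len(correct)

    # Largest possible distance from the target, used to scale into [0, 1].
    max_distance = max(target, 1.0 - target)
    return 1.0 - abs(success_rate - target) / max_distance


# Evidence the generator answers half the time scores highest.
print(boundary_reward([True, False, True, False]))
# Evidence that is always answered (trivially easy) scores lowest.
print(boundary_reward([True, True, True, True]))
```

Under this assumed shape, answer-revealing passages (success rate near 1) and insufficient passages (success rate near 0) both receive low reward, which mirrors the abstract's description of evidence that is "neither trivially easy nor fundamentally unanswerable."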

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Information Retrieval and Search Behavior