Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

Yilun Zhao; Jinbiao Wei; Tingyu Song; Siyue Zhang; Chen Zhao; Arman Cohan

arXiv:2605.04018·cs.CL·May 6, 2026

TL;DR

This paper introduces a new benchmark and a new training corpus for reasoning-intensive retrieval, both built around evidence diversity and agentic search. Aspect-aware evaluation exposes retriever behaviors that standard metrics hide, and a fine-tuned retriever, RTriever-4B, outperforms its base model.

Contribution

The paper presents BRIGHT-Pro, an expert-annotated benchmark that expands each query with multi-aspect gold evidence, and RTriever-Synth, an aspect-decomposed synthetic training corpus. Together they address both the evaluation and the training of retrievers.

Findings

01. RTriever-4B outperforms its base model in experiments.

02. Aspect-aware evaluation reveals behaviors hidden by standard metrics.

03. BRIGHT-Pro provides multi-aspect gold evidence for better assessment.
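
To illustrate how an aspect-aware metric can separate retrievers that a standard metric scores identically, here is a minimal sketch. It is not the paper's metric: the function names, the `gold_aspects` layout (aspect label mapped to a set of gold document IDs), and the toy data are all assumptions made for this example.

```python
from typing import Dict, List, Set


def doc_recall_at_k(ranked_ids: List[str],
                    gold_aspects: Dict[str, Set[str]],
                    k: int) -> float:
    """Standard recall@k: share of all gold documents found in the top k."""
    gold = set().union(*gold_aspects.values()) if gold_aspects else set()
    if not gold:
        return 0.0
    return len(set(ranked_ids[:k]) & gold) / len(gold)


def aspect_recall_at_k(ranked_ids: List[str],
                       gold_aspects: Dict[str, Set[str]],
                       k: int) -> float:
    """Hypothetical aspect-aware recall@k: share of aspects that have
    at least one of their gold documents in the top k."""
    if not gold_aspects:
        return 0.0
    top_k = set(ranked_ids[:k])
    covered = sum(1 for docs in gold_aspects.values() if top_k & docs)
    return covered / len(gold_aspects)


# Toy query with three aspects (illustrative data, not from BRIGHT-Pro).
gold_aspects = {
    "aspect_a": {"d1", "d2"},
    "aspect_b": {"d3"},
    "aspect_c": {"d4"},
}

# Both rankings retrieve two of the four gold documents in the top 3,
# but ranking_b spreads them across two aspects instead of one.
ranking_a = ["d1", "d2", "x"]
ranking_b = ["d1", "d3", "x"]

for name, ranking in [("ranking_a", ranking_a), ("ranking_b", ranking_b)]:
    print(name,
          doc_recall_at_k(ranking, gold_aspects, 3),
          aspect_recall_at_k(ranking, gold_aspects, 3))
```

In this toy case both rankings have the same document-level recall, while the aspect-level score is higher for the ranking that covers more distinct aspects. That is the kind of difference a document-level metric alone cannot show.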

Abstract

Reasoning-intensive retrieval aims to surface evidence that supports downstream reasoning rather than merely matching topical similarity. This capability is increasingly important for agentic search systems, where retrievers must provide complementary evidence across iterative search and synthesis. However, existing work remains limited on both evaluation and training: benchmarks such as BRIGHT provide narrow gold sets and evaluate retrievers in isolation, while synthetic training corpora often optimize single-passage relevance rather than evidence portfolio construction. We introduce BRIGHT-Pro, an expert-annotated benchmark that expands each query with multi-aspect gold evidence and evaluates retrievers under both static and agentic search protocols. We further construct RTriever-Synth, an aspect-decomposed synthetic corpus that generates complementary positives and…

Code & Models

Repositories

yale-nlp/Bright-Pro (GitHub)