When More Reformulations Hurt: Avoiding Drift using Ranker Feedback
V Venktesh, Mandeep Rathee, Avishek Anand

TL;DR
ReformIR is a budget-aware retrieval framework that adaptively selects reformulations and documents to improve recall while minimizing query drift, using online relevance estimation and a surrogate model.
Contribution
This work introduces ReformIR, a framework that balances reformulation diversity against relevance under a strict inference budget and outperforms existing reformulation strategies.
Findings
ReformIR outperforms existing reformulation strategies on the MS MARCO and TREC benchmarks.
The approach effectively suppresses query drift through online feature selection.
ReformIR maintains high recall while using fewer reformulations than naive methods.
Abstract
Modern retrieval pipelines increasingly rely on query reformulation and neural reranking to improve effectiveness, but this comes at a significant computational cost and introduces a fundamental tradeoff between recall and query drift. Generating many reformulated queries can substantially increase recall, yet naively merging or exhaustively reranking their results is prohibitively expensive. In this work, we argue that the core challenge is not reformulation generation itself, but the adaptive selection of reformulations and their retrieved documents under a strict inference budget. We propose ReformIR, a budget-aware retrieval framework that treats query reformulations as first-class features and performs online relevance estimation using a strong reranker as a teacher. Given multiple reformulated queries, ReformIR constructs a large candidate pool and learns a lightweight surrogate…
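The abstract is cut off before the surrogate is described, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It assumes each candidate document is represented by its retrieval score under each reformulation, that a linear surrogate fitted by plain gradient descent stands in for the lightweight model, and that one teacher call costs one unit of budget. The names `budgeted_rerank`, `features`, `teacher`, `batch_size`, `lr`, and `epochs` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def budgeted_rerank(
    features: Dict[str, List[float]],  # doc_id -> score under each reformulation (0.0 if not retrieved)
    teacher: Callable[[str], float],   # hypothetical expensive reranker; one call = one budget unit
    budget: int,
    batch_size: int = 2,
    lr: float = 0.1,
    epochs: int = 50,
) -> Tuple[List[Tuple[str, float]], List[float]]:
    """Spend at most `budget` teacher calls, guided by a linear surrogate."""
    n_feats = len(next(iter(features.values())))
    weights = [1.0] * n_feats  # assumption: start by trusting every reformulation equally
    scored: Dict[str, float] = {}

    def predict(doc_id: str) -> float:
        # Surrogate estimate: weighted sum of per-reformulation retrieval scores.
        return sum(w * x for w, x in zip(weights, features[doc_id]))

    while len(scored) < min(budget, len(features)):
        # Pick the unscored documents the surrogate currently rates highest.
        remaining = sorted(
            (d for d in features if d not in scored), key=predict, reverse=True
        )
        take = min(batch_size, budget - len(scored))
        for doc_id in remaining[:take]:
            scored[doc_id] = teacher(doc_id)

        # Refit the surrogate on all teacher labels seen so far (squared-error SGD).
        for _ in range(epochs):
            for doc_id, target in scored.items():
                err = predict(doc_id) - target
                for j, x in enumerate(features[doc_id]):
                    weights[j] -= lr * err * x

    ranking = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return ranking, weights


# Toy pool: reformulation 0 stays on topic, reformulation 1 drifts.
pool = {
    "a": [0.9, 0.0], "b": [0.8, 0.0], "c": [0.0, 0.9],
    "d": [0.0, 0.8], "e": [0.7, 0.0], "f": [0.0, 0.7],
}
gold = {"a": 1.0, "b": 0.9, "e": 0.8, "c": 0.0, "d": 0.0, "f": 0.0}

ranking, weights = budgeted_rerank(pool, teacher=gold.__getitem__, budget=4)
print([doc_id for doc_id, _ in ranking])  # ['a', 'b', 'e', 'c']
```

In this toy run the first batch contains one document from each reformulation. Once the teacher scores the drifting document at zero, the weight on reformulation 1 shrinks, and the remaining budget goes to documents from the on-topic reformulation. Documents `d` and `f` are never sent to the teacher.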
