Annotation-Free Reinforcement Learning Query Rewriting via Verifiable Search Reward
Sungguk Cha, DongWook Kim, Taeseung Hahn, Mintae Kim, Youngsub Han, Byoung-Ki Jeon

TL;DR
RL-QR is a reinforcement learning framework for query rewriting that needs no human annotations: it trains on verifiable search rewards and improves retrieval performance across multiple datasets and modalities.
Contribution
The paper introduces RL-QR, an annotation-free reinforcement learning approach to query rewriting. It uses verifiable search rewards derived from index-aligned synthetic queries, which extends its applicability to various modalities and index domains and removes the cost of human annotation.
Findings
Up to 3.9× improvement on lexical retrievers (MTEB VIDORE V2)
Up to 3.5× improvement on semantic retrievers (MTEB VIDORE V2)
Consistent 5% to 10% improvements on MS MARCO v2.1 and internal industrial datasets
Abstract
Optimizing queries for Retrieval-Augmented Generation (RAG) systems poses a significant challenge, particularly across diverse modal indices. We introduce RL-QR, a novel annotation-free reinforcement learning framework for query rewriting that eliminates the need for costly human-annotated data. By leveraging verifiable search rewards derived from index-aligned synthetic queries, RL-QR overcomes human-annotation dependencies, extending its applicability to various modalities and index domains. Experimental results demonstrate the framework's robustness, achieving substantial retrieval performance gains of up to 3.9× on lexical retrievers and 3.5× on semantic retrievers on the MTEB VIDORE V2 benchmark for unstructured visual documents, along with consistent 5% to 10% improvements on MS MARCO v2.1 and internal industrial datasets.
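The abstract does not spell out how a search reward is computed, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that each synthetic query is generated from a known source document, so a rewrite can be scored automatically by where that document lands in the search results. The toy token-overlap retriever, the reciprocal-rank reward, and all names (`overlap_score`, `search`, `search_reward`, the sample index) are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; a real system would use the retriever's own analyzer.
    return text.lower().split()


def overlap_score(query, doc):
    # Toy lexical score (assumption): count of query tokens also present in the doc.
    q = Counter(tokenize(query))
    d = Counter(tokenize(doc))
    return sum(min(count, d[token]) for token, count in q.items())


def search(query, index, k=10):
    # Rank doc ids by descending score; ties are broken by id for determinism.
    ranked = sorted(index, key=lambda doc_id: (-overlap_score(query, index[doc_id]), doc_id))
    return ranked[:k]


def search_reward(rewritten_query, source_doc_id, index, k=10):
    # Hypothetical verifiable reward: reciprocal rank of the document the
    # synthetic query was generated from, or 0.0 if it is not in the top k.
    results = search(rewritten_query, index, k)
    if source_doc_id not in results:
        return 0.0
    return 1.0 / (results.index(source_doc_id) + 1)


# Made-up index; "d3" plays the role of the document the query was derived from.
index = {
    "d1": "quarterly revenue chart for the cloud division",
    "d2": "employee onboarding checklist and policies",
    "d3": "cloud division incident report and outage timeline",
}

raw_query = "what happened with the cloud"
rewritten_query = "cloud division outage incident timeline"

raw_reward = search_reward(raw_query, "d3", index)
rewritten_reward = search_reward(rewritten_query, "d3", index)
```

In this toy setup the raw query ranks the source document second (reward 0.5) while the rewrite ranks it first (reward 1.0). A reinforcement learning loop could use such a score as the training signal for the rewriter without any human-labeled rewrites, which is the property the abstract describes as annotation-free.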
Taxonomy
Topics: Web Data Mining and Analysis · Advanced Image and Video Retrieval Techniques · Information Retrieval and Search Behavior
