ReFeed: Retrieval Feedback-Guided Dataset Construction for Style-Aware Query Rewriting
Jiyoon Myung, Jungki Son, Kyungro Lee, Jihyeon Park, Joohyung Han

TL;DR
This paper presents ReFeed, a framework that uses retrieval feedback and large language models to generate style-aware query rewrites, improving retrieval accuracy in domain-specific contexts.
Contribution
ReFeed introduces a novel retrieval feedback-driven dataset construction method that incorporates document style into query rewriting for better retrieval performance.
Findings
Generated query pairs improve retrieval accuracy.
Style-aware rewrites enhance domain-specific retrieval.
The re-retrieval feedback loop checks rewrites for relevance and style alignment.
Abstract
Retrieval systems often fail when user queries differ stylistically or semantically from the language used in domain documents. Query rewriting has been proposed to bridge this gap, improving retrieval by reformulating user queries into semantically equivalent forms. However, most existing approaches overlook the stylistic characteristics of target documents (their domain-specific phrasing, tone, and structure), which are crucial for matching real-world data distributions. We introduce a retrieval feedback-driven dataset generation framework that automatically identifies failed retrieval cases, leverages large language models to rewrite queries in the style of relevant documents, and verifies improvement through re-retrieval. The resulting corpus of (original, rewritten) query pairs enables the training of rewriter models that are explicitly aware of document style and retrieval feedback.…
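The three steps named in the abstract (detect a failed retrieval, rewrite the query in the style of the relevant document, keep the pair only if re-retrieval improves) can be sketched as a small loop. This is an illustrative sketch, not the authors' implementation: the token-overlap scorer, the top-k success criterion, the `GLOSSARY`-based `toy_rewrite` (a stand-in for the LLM rewriter), and the toy corpus are all assumptions made here so the example runs on its own.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def score(query: str, doc: str) -> float:
    """Assumed toy retriever: token overlap, normalized by document length."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    overlap = sum(min(q[t], d[t]) for t in q)
    return overlap / math.sqrt(max(len(tokenize(doc)), 1))


def rank_of(query: str, gold_id: str, corpus: Dict[str, str]) -> int:
    """1-based rank of the relevant document for this query."""
    ranked = sorted(corpus, key=lambda i: score(query, corpus[i]), reverse=True)
    return ranked.index(gold_id) + 1


def build_pairs(
    examples: List[Tuple[str, str]],
    corpus: Dict[str, str],
    rewrite: Callable[[str, str], str],
    k: int = 1,
) -> List[Tuple[str, str]]:
    """Collect (original, rewritten) pairs that pass the re-retrieval check."""
    pairs = []
    for query, gold_id in examples:
        if rank_of(query, gold_id, corpus) <= k:
            continue  # retrieval already succeeds; not a failure case
        candidate = rewrite(query, corpus[gold_id])
        if rank_of(candidate, gold_id, corpus) <= k:
            pairs.append((query, candidate))  # verified improvement
    return pairs


# Hypothetical stand-in for an LLM rewriter: swaps lay phrases for the
# phrasing found in the relevant document.
GLOSSARY = {"heart attack": "myocardial infarction", "treatment": "management"}


def toy_rewrite(query: str, target_doc: str) -> str:
    rewritten = query.lower()
    for lay, formal in GLOSSARY.items():
        if formal in target_doc.lower():
            rewritten = rewritten.replace(lay, formal)
    return rewritten


corpus = {
    "d1": "acute myocardial infarction management guidelines",
    "d2": "heart healthy diet and exercise tips",
}
examples = [
    ("heart attack treatment", "d1"),  # fails, rewrite fixes it
    ("diet tips", "d2"),               # already succeeds, skipped
    ("heart pain", "d1"),              # fails, rewrite does not help, dropped
]
pairs = build_pairs(examples, corpus, toy_rewrite)
print(pairs)
```

Only the first example survives: the second never fails retrieval, and the third is discarded because its rewrite does not bring the relevant document into the top k. In a real setting the scorer and `toy_rewrite` would be replaced by the actual retriever and an LLM call.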
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Biomedical Text Mining and Ontologies
