Align Documents to Questions: Question-Oriented Document Rewriting for Retrieval-Augmented Generation
Jiaang Li, Zhendong Mao, Quan Wang, Yuning Wan, Yongdong Zhang

TL;DR
QREAM is a style-controlled document rewriter that aligns retrieved evidence with a question-oriented style, improving factual grounding and relevance in retrieval-augmented generation systems.
Contribution
The paper introduces a two-stage framework for rewriting retrieved documents so that LLMs can use them more effectively, together with a lightweight distilled model for efficient deployment.
Findings
QREAM yields a relative improvement of up to 8% in RAG performance.
The framework enhances factual consistency and question relevance.
QREAM integrates seamlessly as a plug-and-play module.
Abstract
Retrieval-Augmented Generation (RAG) enhances the factuality of Large Language Models (LLMs) by incorporating retrieved documents and/or generated context. However, LLMs often exhibit a stylistic bias when presented with mixed contexts, favoring fluent but hallucinated generated content over factually grounded yet disorganized retrieved evidence. This phenomenon reveals that the utility of retrieved information is bottlenecked by its presentation. To bridge this gap, we propose QREAM, a style-controlled rewriter that aligns retrieved documents with a question-oriented style while preserving facts, making them easier for LLM readers to use. Our framework consists of two stages: (1) QREAM-ICL, which uses stylistic seeds to guide iterative rewriting exploration; and (2) QREAM-FT, a lightweight student model distilled from denoised ICL outputs. QREAM-FT employs dual-criteria rejection sampling,…
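The abstract is cut off before it names the two criteria used in rejection sampling, so the following is only a minimal sketch of the general idea: a candidate rewrite is kept for distillation only if it passes two independent checks. The `Candidate` type, the two scoring functions (`fact_retention`, `question_coverage`), and the thresholds are all hypothetical placeholders based on simple token overlap, not the paper's actual criteria.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Candidate:
    question: str
    source: str   # retrieved document
    rewrite: str  # candidate rewrite produced by the ICL stage


def _content_tokens(text: str) -> Set[str]:
    # Crude tokenizer: lowercase alphanumeric tokens longer than 3 characters.
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) > 3}


def fact_retention(c: Candidate) -> float:
    # Placeholder criterion (assumption): share of source tokens kept in the rewrite.
    src = _content_tokens(c.source)
    return len(src & _content_tokens(c.rewrite)) / len(src) if src else 0.0


def question_coverage(c: Candidate) -> float:
    # Placeholder criterion (assumption): share of question tokens present in the rewrite.
    q = _content_tokens(c.question)
    return len(q & _content_tokens(c.rewrite)) / len(q) if q else 0.0


def dual_criteria_filter(
    candidates: List[Candidate],
    criterion_a: Callable[[Candidate], float] = fact_retention,
    criterion_b: Callable[[Candidate], float] = question_coverage,
    threshold_a: float = 0.5,
    threshold_b: float = 0.5,
) -> List[Candidate]:
    # Reject any candidate that fails either criterion.
    return [
        c for c in candidates
        if criterion_a(c) >= threshold_a and criterion_b(c) >= threshold_b
    ]


if __name__ == "__main__":
    question = "Who wrote Hamlet?"
    source = "Hamlet is a tragedy written by William Shakespeare."
    pool = [
        Candidate(question, source, "Hamlet was written by William Shakespeare."),
        Candidate(question, source, "Hamlet was written by Christopher Marlowe in 1599."),
    ]
    for kept in dual_criteria_filter(pool):
        print(kept.rewrite)
```

In this toy run the second rewrite is dropped because it retains too little of the source text. A real pipeline would presumably replace the overlap heuristics with model-based judgments, which the truncated abstract does not describe.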
