Align Documents to Questions: Question-Oriented Document Rewriting for Retrieval-Augmented Generation

Jiaang Li; Zhendong Mao; Quan Wang; Yuning Wan; Yongdong Zhang

arXiv:2604.17325·cs.CL·April 21, 2026

TL;DR

QREAM is a style-controlled document rewriter that aligns retrieved evidence with a question-oriented style, improving factual grounding and relevance in retrieval-augmented generation systems.

Contribution

The paper introduces a two-stage framework for rewriting retrieved documents so that LLM readers can use them more effectively: an in-context-learning stage (QREAM-ICL) and a lightweight distilled student model (QREAM-FT) for efficient deployment.

Findings

01

QREAM improves RAG performance by up to 8% in relative terms.

02

The framework enhances factual consistency and question relevance.

03

QREAM integrates seamlessly as a plug-and-play module.

Abstract

Retrieval-Augmented Generation (RAG) enhances the factuality of Large Language Models (LLMs) by incorporating retrieved documents and/or generated context. However, LLMs often exhibit a stylistic bias when presented with mixed contexts, favoring fluent but hallucinated generated content over factually grounded yet disorganized retrieved evidence. This phenomenon reveals that the utility of retrieved information is bottlenecked by its presentation. To bridge this gap, we propose QREAM, a style-controlled rewriter that aligns retrieved documents with a question-oriented style while preserving their facts, making them easier for LLM readers to use. Our framework consists of two stages: (1) QREAM-ICL, which uses stylistic seeds to guide iterative rewriting exploration; and (2) QREAM-FT, a lightweight student model distilled from denoised ICL outputs. QREAM-FT employs dual-criteria rejection sampling,…
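To make the plug-and-play idea concrete, here is a minimal sketch of where a document rewriter could sit in a RAG pipeline: between retrieval and the LLM reader. It is not the QREAM implementation. Every name (`RagPipeline`, `identity_rewriter`, `collapse_whitespace`) is hypothetical, and the whitespace-collapsing function is only a stand-in for a real question-oriented rewriter.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type aliases; none of these names come from the paper.
Retriever = Callable[[str], List[str]]   # question -> retrieved documents
Rewriter = Callable[[str, str], str]     # (question, document) -> rewritten document
Reader = Callable[[str], str]            # prompt -> answer


def identity_rewriter(question: str, document: str) -> str:
    """Baseline: pass the retrieved document through unchanged."""
    return document


def collapse_whitespace(question: str, document: str) -> str:
    """Stand-in rewriter (assumption): only tidies whitespace.

    A real rewriter would restyle the document around the question
    while keeping its facts; this placeholder ignores the question.
    """
    return " ".join(document.split())


@dataclass
class RagPipeline:
    retrieve: Retriever
    read: Reader
    rewrite: Rewriter = identity_rewriter  # the plug-in slot

    def build_prompt(self, question: str) -> str:
        # Rewrite each retrieved document before it reaches the reader.
        docs = self.retrieve(question)
        rewritten = [self.rewrite(question, d) for d in docs]
        context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(rewritten))
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

    def answer(self, question: str) -> str:
        return self.read(self.build_prompt(question))
```

Because the rewriter is just a callable with a pass-through default, the retriever and reader need no changes when a rewriter is swapped in or out, which is the property the plug-and-play claim relies on.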
