PrismRAG: Boosting RAG Factuality with Distractor Resilience and Strategized Reasoning

Mohammad Kachuee; Teja Gollapudi; Minseok Kim; Yin Huang; Kai Sun; Xiao Yang; Jiaqi Wang; Nirav Shah; Yue Liu; Aaron Colak; Anuj Kumar; Wen-tau Yih; Xin Luna Dong

arXiv:2507.18857·cs.CL·July 28, 2025

TL;DR

PrismRAG is a fine-tuning framework that enhances retrieval-augmented generation models by making them more resilient to distractors and better at reasoning, leading to improved factual accuracy across multiple benchmarks.

Contribution

PrismRAG introduces distractor-aware fine-tuning and instills reasoning-centric habits, improving RAG models without relying on extensive human-engineered instructions.

Findings

01  Increases average factuality by 5.4% across 12 open-book RAG QA benchmarks.

02  Outperforms state-of-the-art RAG solutions on those benchmarks.

03  Remains effective across diverse application domains and scenarios.

Abstract

Retrieval-augmented generation (RAG) often falls short when the retrieved context includes confusing, semi-relevant passages, or when answering a question requires deep contextual understanding and reasoning. We propose an efficient fine-tuning framework, called PrismRAG, that (i) trains the model with distractor-aware QA pairs mixing gold evidence with subtle distractor passages, and (ii) instills reasoning-centric habits that make the LLM plan, rationalize, and synthesize without relying on extensive human-engineered instructions. Evaluated across 12 open-book RAG QA benchmarks spanning diverse application domains and scenarios, PrismRAG improves average factuality by 5.4%, outperforming state-of-the-art solutions.
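
To make point (i) concrete, the following is a minimal sketch of how a distractor-aware training example might be assembled: gold passages are mixed with a few distractor passages, shuffled, and rendered into a prompt. This is an illustration under assumptions, not the authors' pipeline. The abstract does not say how distractors are produced or how prompts are formatted, so the names `TrainingExample`, `build_distractor_aware_example`, `format_prompt`, and the `num_distractors` parameter are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class TrainingExample:
    question: str
    passages: list[str]       # gold and distractor passages, shuffled together
    gold_indices: list[int]   # positions of the gold passages after shuffling
    answer: str


def build_distractor_aware_example(question, answer, gold_passages,
                                   distractor_passages, num_distractors=2,
                                   seed=0):
    """Mix gold evidence with sampled distractors (hypothetical helper)."""
    rng = random.Random(seed)
    # Sample at most as many distractors as are available.
    k = min(num_distractors, len(distractor_passages))
    chosen = rng.sample(distractor_passages, k)
    # Tag each passage so gold positions can be recovered after shuffling.
    labeled = [(p, True) for p in gold_passages]
    labeled += [(p, False) for p in chosen]
    rng.shuffle(labeled)
    passages = [p for p, _ in labeled]
    gold_indices = [i for i, (_, is_gold) in enumerate(labeled) if is_gold]
    return TrainingExample(question, passages, gold_indices, answer)


def format_prompt(example):
    """Render the mixed context as a numbered list (assumed prompt layout)."""
    context = "\n".join(
        f"[{i + 1}] {p}" for i, p in enumerate(example.passages)
    )
    return f"Context:\n{context}\n\nQuestion: {example.question}\nAnswer:"


# Toy data invented purely for illustration.
example = build_distractor_aware_example(
    question="In which year was the bridge completed?",
    answer="1937",
    gold_passages=["The bridge was completed in 1937."],
    distractor_passages=[
        "Construction of the bridge began in 1933.",
        "A nearby tunnel was completed in 1936.",
        "The bridge was repainted in 1965.",
    ],
)
print(format_prompt(example))
```

Shuffling keeps the gold passage from always sitting in the same position, so a model fine-tuned on such examples cannot rely on position alone to find the evidence.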

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques