Guaranteeing Knowledge Integration with Joint Decoding for Retrieval-Augmented Generation

Zhengyi Zhao; Shubo Zhang; Zezhong Wang; Yuxi Zhang; Huimin Wang; Yutian Zhao; Yefeng Zheng; Binyang Li; Kam-Fai Wong; Xian Wu

arXiv:2604.08046 · cs.CL · April 16, 2026

TL;DR

GuarantRAG introduces a novel framework for retrieval-augmented generation that explicitly separates reasoning from evidence integration, improving accuracy and reducing hallucinations in large language models.

Contribution

The paper proposes a new joint decoding approach that decouples reasoning from evidence integration, enhancing factual accuracy and reducing hallucinations in RAG systems.

Findings

01. Improves QA accuracy by up to 12.1% over baselines.

02. Reduces hallucinations by 16.3% compared to standard RAG.

03. Demonstrates effectiveness across five QA benchmarks.

Abstract

Retrieval-Augmented Generation (RAG) significantly enhances Large Language Models (LLMs) by providing access to external knowledge. However, current research primarily focuses on retrieval quality, often overlooking the critical "integration bottleneck": even when relevant documents are retrieved, LLMs frequently fail to utilize them effectively due to conflicts with their internal parametric knowledge. In this paper, we argue that implicitly resolving this conflict in a single generation pass is suboptimal. We introduce GuarantRAG, a framework that explicitly decouples reasoning from evidence integration. First, we generate an "Inner-Answer" based solely on parametric knowledge to capture the model's reasoning flow. Second, to guarantee faithful evidence extraction, we generate a "Refer-Answer" using a novel Contrastive DPO objective. This objective treats the parametric…
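
For orientation, the abstract's Contrastive DPO objective builds on Direct Preference Optimization. The abstract is cut off before it defines the contrastive variant, so the following is only a minimal sketch of the standard DPO loss for a single preference pair, not the authors' objective. The function name `dpo_loss`, its parameter names, and the default `beta` are illustrative assumptions, and the sketch does not say which answer GuarantRAG treats as chosen or rejected.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Hypothetical helper for illustration only; this is NOT the paper's
    Contrastive DPO objective, whose definition is truncated in the abstract.
    Inputs are sequence log-probabilities under the trained policy and a
    frozen reference model.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss shrinks as the policy assigns relatively more probability to the chosen answer than the reference model does, and grows when it favors the rejected answer.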
