Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
Zhengliang Shi, Weiwei Sun, Shen Gao, Pengjie Ren, Zhumin Chen, Zhaochun Ren

TL;DR
The paper introduces GenGround, a novel generate-then-ground framework that enhances multi-hop question answering by iteratively generating and grounding answers in retrieved documents, outperforming existing methods.
Contribution
It proposes a new generate-then-ground approach for multi-hop QA, combining parametric and external knowledge, with an instructional grounding distillation for smaller models.
Findings
GenGround achieves superior performance on four datasets.
The iterative generate-then-ground process improves answer accuracy.
Instructional grounding distillation enables smaller models to perform effectively.
Abstract
Multi-Hop Question Answering (MHQA) tasks present a significant challenge for large language models (LLMs) due to the intensive knowledge required. Current solutions, like Retrieval-Augmented Generation, typically retrieve potential documents from an external corpus to read an answer. However, the performance of this retrieve-then-read paradigm is constrained by the retriever and the inevitable noise in the retrieved documents. To mitigate these challenges, we introduce a novel generate-then-ground (GenGround) framework, synergizing the parametric knowledge of LLMs and external documents to solve a multi-hop question. GenGround empowers LLMs to alternate two phases until the final answer is derived: (1) formulate a simpler, single-hop question and directly generate the answer; (2) ground the question-answer pair in retrieved documents, amending any wrong predictions in the answer. We…
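The two alternating phases described in the abstract can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `gen_ground`, the three callables (`deduce`, `retrieve`, `ground`), the `max_hops` cap, and the convention that `deduce` returns `None` once the question is resolved are all hypothetical stand-ins for LLM prompts and a retriever that the abstract does not specify.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   deduce(question, history)    -> (sub_question, draft_answer), or None when done
#   retrieve(sub_question)       -> list of document strings
#   ground(sub_q, draft, docs)   -> answer revised against the documents
Step = Tuple[str, str]
Deduce = Callable[[str, List[Step]], Optional[Step]]
Retrieve = Callable[[str], List[str]]
Ground = Callable[[str, str, List[str]], str]


def gen_ground(
    question: str,
    deduce: Deduce,
    retrieve: Retrieve,
    ground: Ground,
    max_hops: int = 5,  # assumed safety cap on the number of iterations
) -> List[Step]:
    """Alternate generate and ground phases; return the grounded steps."""
    history: List[Step] = []
    for _ in range(max_hops):
        # Phase 1: form a simpler single-hop question and answer it directly.
        step = deduce(question, history)
        if step is None:  # the model signals that no further hop is needed
            break
        sub_question, draft_answer = step
        # Phase 2: ground the question-answer pair in retrieved documents,
        # letting the grounding step amend a wrong draft answer.
        documents = retrieve(sub_question)
        revised_answer = ground(sub_question, draft_answer, documents)
        history.append((sub_question, revised_answer))
    return history
```

Under these assumptions, the final answer is the revised answer of the last step in the returned history. Passing the history back into `deduce` is what lets each new single-hop question build on already-grounded answers rather than on unverified drafts.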
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Speech and dialogue systems
