Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering

Zhengliang Shi; Weiwei Sun; Shen Gao; Pengjie Ren; Zhumin Chen; Zhaochun Ren

arXiv:2406.14891 · cs.CL · September 17, 2024

TL;DR

The paper introduces GenGround, a novel generate-then-ground framework that enhances multi-hop question answering by iteratively generating and grounding answers in retrieved documents, outperforming existing methods.

Contribution

It proposes a new generate-then-ground approach for multi-hop QA, combining parametric and external knowledge, with an instructional grounding distillation for smaller models.

Findings

01. GenGround achieves superior performance on four datasets.

02. The iterative generate-then-ground process improves answer accuracy.

03. Instructional grounding distillation enables smaller models to perform effectively.

Abstract

Multi-Hop Question Answering (MHQA) tasks present a significant challenge for large language models (LLMs) due to the intensive knowledge required. Current solutions, like Retrieval-Augmented Generation, typically retrieve potential documents from an external corpus to read an answer. However, the performance of this retrieve-then-read paradigm is constrained by the retriever and the inevitable noise in the retrieved documents. To mitigate these challenges, we introduce a novel generate-then-ground (GenGround) framework, synergizing the parametric knowledge of LLMs and external documents to solve a multi-hop question. GenGround empowers LLMs to alternate two phases until the final answer is derived: (1) formulate a simpler, single-hop question and directly generate the answer; (2) ground the question-answer pair in retrieved documents, amending any wrong predictions in the answer. We…
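The two alternating phases described in the abstract can be sketched as a simple loop. This is a minimal illustration, not the authors' implementation: the names `generate_then_ground`, `deduce`, `retrieve`, `ground`, and `max_hops` are hypothetical, the three callables stand in for LLM and retriever calls, and treating the last grounded answer as the final answer is an assumption. The toy stubs below use hard-coded strings only so the loop can be run end to end.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    sub_question: str      # simpler single-hop question
    draft_answer: str      # answer generated from parametric knowledge
    grounded_answer: str   # answer after checking against retrieved documents


def generate_then_ground(
    question: str,
    deduce: Callable[[str, List[Step]], Optional[Tuple[str, str]]],
    retrieve: Callable[[str], List[str]],
    ground: Callable[[str, str, List[str]], str],
    max_hops: int = 5,  # hypothetical safety cap, not taken from the paper
) -> Tuple[Optional[str], List[Step]]:
    """Alternate the two phases until `deduce` signals that it is done."""
    history: List[Step] = []
    for _ in range(max_hops):
        # Phase 1: formulate a single-hop question and generate an answer directly.
        proposal = deduce(question, history)
        if proposal is None:  # assumption: None means no further hop is needed
            break
        sub_q, draft = proposal
        # Phase 2: ground the question-answer pair in retrieved documents,
        # revising the draft if the documents contradict it.
        docs = retrieve(sub_q)
        revised = ground(sub_q, draft, docs)
        history.append(Step(sub_q, draft, revised))
    # Assumption: the last grounded answer is taken as the final answer.
    final = history[-1].grounded_answer if history else None
    return final, history


# --- Toy stand-ins for the LLM and the retriever (illustration only) ---
TOY_CORPUS = {
    "In which country is the Eiffel Tower?": ["The Eiffel Tower is in France."],
    "What is the capital of France?": ["Paris is the capital of France."],
}


def toy_deduce(question: str, history: List[Step]) -> Optional[Tuple[str, str]]:
    if len(history) == 0:
        # Deliberately wrong draft, so that grounding has something to fix.
        return ("In which country is the Eiffel Tower?", "Italy")
    if len(history) == 1:
        return (f"What is the capital of {history[0].grounded_answer}?", "Paris")
    return None


def toy_retrieve(sub_question: str) -> List[str]:
    return TOY_CORPUS.get(sub_question, [])


def toy_ground(sub_question: str, draft: str, docs: List[str]) -> str:
    text = " ".join(docs)
    if draft in text:
        return draft  # draft is supported by the documents
    for entity in ("France", "Paris"):
        if entity in text:
            return entity  # replace the unsupported draft
    return draft  # nothing better found; keep the draft


final, history = generate_then_ground(
    "What is the capital of the country where the Eiffel Tower is located?",
    toy_deduce, toy_retrieve, toy_ground,
)
print(final)
for step in history:
    print(step)
```

Passing the three stages in as callables keeps the control flow separate from any particular model or retriever, so the same loop could be exercised with real components or with stubs like the ones above.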

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Speech and dialogue systems