GROUNDEDKG-RAG: Grounded Knowledge Graph Index for Long-document Question Answering
Tianyi Zhang, Andreas Marfurt

TL;DR
GroundedKG-RAG is a knowledge graph-based retrieval-augmented generation system for long-document question answering that improves factual accuracy, efficiency, and interpretability by grounding its graph in the source text.
Contribution
The paper introduces a grounding method for long-document QA that uses explicit knowledge graphs constructed from semantic role labeling (SRL) and abstract meaning representation (AMR) parses.
Findings
Performs on par with state-of-the-art models at lower cost
Outperforms a competitive baseline in accuracy
Provides interpretable and human-readable knowledge graphs
Abstract
Retrieval-augmented generation (RAG) systems have been widely adopted in contemporary large language models (LLMs) due to their ability to improve generation quality while reducing the required input context length. In this work, we focus on RAG systems for long-document question answering. Current approaches suffer from a heavy reliance on LLM descriptions, which results in high resource consumption and latency; repetitive content across hierarchical levels; and hallucinations caused by little or no grounding in the source text. To improve both efficiency and factual accuracy through grounding, we propose GroundedKG-RAG, a RAG system in which the knowledge graph is explicitly extracted from and grounded in the source document. Specifically, we define nodes in GroundedKG as entities and actions, and edges as temporal or semantic relations, with each node and edge grounded in the original…
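The graph definition in the abstract (nodes as entities and actions, edges as temporal or semantic relations, each tied to the source) can be illustrated with a small data structure. The sketch below is not the authors' implementation. Every name in it (`Span`, `Node`, `Edge`, `GroundedGraph`, `span_of`) is hypothetical, and it assumes grounding is stored as a character span into the document, which the truncated abstract does not confirm. The edge labels `ARG0` and `ARG1` are illustrative PropBank-style role names, not labels taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass(frozen=True)
class Span:
    """Half-open character range [start, end) in the source document (assumed grounding format)."""
    start: int
    end: int

    def text(self, document: str) -> str:
        return document[self.start:self.end]


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: Literal["entity", "action"]  # the two node types named in the abstract
    span: Span                         # where the node is grounded in the text


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: Literal["temporal", "semantic"]  # the two edge types named in the abstract
    label: str                             # hypothetical relation label
    span: Span                             # where the relation is grounded in the text


@dataclass
class GroundedGraph:
    document: str
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def _check_span(self, span: Span) -> None:
        # Reject any span that does not point at real text in the document.
        if not (0 <= span.start < span.end <= len(self.document)):
            raise ValueError(f"span {span} is outside the document")

    def add_node(self, node: Node) -> None:
        self._check_span(node.span)
        self.nodes[node.node_id] = node

    def add_edge(self, edge: Edge) -> None:
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self._check_span(edge.span)
        self.edges.append(edge)

    def evidence(self, node_id: str) -> str:
        """Return the exact source text that a node is grounded in."""
        return self.nodes[node_id].span.text(self.document)


def span_of(document: str, phrase: str) -> Span:
    """Locate the first occurrence of `phrase`; raises ValueError if it is absent."""
    start = document.index(phrase)
    return Span(start, start + len(phrase))


document = "Marie Curie discovered polonium in 1898."
graph = GroundedGraph(document)
graph.add_node(Node("n1", "entity", span_of(document, "Marie Curie")))
graph.add_node(Node("n2", "action", span_of(document, "discovered")))
graph.add_node(Node("n3", "entity", span_of(document, "polonium")))
graph.add_edge(Edge("n2", "n1", "semantic", "ARG0", span_of(document, "Marie Curie discovered")))
graph.add_edge(Edge("n2", "n3", "semantic", "ARG1", span_of(document, "discovered polonium")))

for node_id in graph.nodes:
    print(node_id, graph.nodes[node_id].kind, repr(graph.evidence(node_id)))
```

Because every node and edge carries a span that is validated on insertion, any retrieved graph element can be traced back to a verbatim passage, which is one plausible way to obtain the interpretability the summary describes.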
