GROUNDEDKG-RAG: Grounded Knowledge Graph Index for Long-document Question Answering

Tianyi Zhang; Andreas Marfurt

arXiv:2604.04359·cs.CL·April 7, 2026

TL;DR

GroundedKG-RAG is a knowledge graph-based retrieval-augmented generation system for long-document question answering. It improves factual accuracy, efficiency, and interpretability by grounding its knowledge graph in the source text.

Contribution

It introduces a novel grounding method for long-document QA that uses explicit knowledge graphs constructed from semantic role labeling (SRL) and Abstract Meaning Representation (AMR) parses.

Findings

1. Performs on par with state-of-the-art models at lower cost
2. Outperforms a competitive baseline in accuracy
3. Provides interpretable and human-readable knowledge graphs

Abstract

Retrieval-augmented generation (RAG) systems have been widely adopted in contemporary large language models (LLMs) due to their ability to improve generation quality while reducing the required input context length. In this work, we focus on RAG systems for long-document question answering. Current approaches suffer from a heavy reliance on LLM descriptions resulting in high resource consumption and latency, repetitive content across hierarchical levels, and hallucinations due to no or limited grounding in the source text. To improve both efficiency and factual accuracy through grounding, we propose GroundedKG-RAG, a RAG system in which the knowledge graph is explicitly extracted from and grounded in the source document. Specifically, we define nodes in GroundedKG as entities and actions, and edges as temporal or semantic relations, with each node and edge grounded in the original…
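The graph structure described in the abstract (entity and action nodes, temporal or semantic edges, each tied back to the source) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class names `Span`, `Node`, `Edge`, and `GroundedKG`, the helper `span_of`, the use of character offsets as the grounding mechanism, and the example sentence are all assumptions made here. The SRL and AMR parsing that would actually produce the nodes and edges is omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """Character offsets into the source document (assumed grounding unit)."""
    start: int  # inclusive
    end: int    # exclusive


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str   # "entity" or "action"
    span: Span


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str   # "temporal" or "semantic"
    label: str
    span: Span


@dataclass
class GroundedKG:
    """Hypothetical container: every node and edge must point at real source text."""
    document: str
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def _check_span(self, span: Span) -> None:
        # Reject spans that fall outside the document or are empty.
        if not (0 <= span.start < span.end <= len(self.document)):
            raise ValueError(f"span {span} is not grounded in the document")

    def add_node(self, node: Node) -> None:
        if node.kind not in ("entity", "action"):
            raise ValueError(f"unknown node kind: {node.kind}")
        self._check_span(node.span)
        self.nodes[node.node_id] = node

    def add_edge(self, edge: Edge) -> None:
        if edge.kind not in ("temporal", "semantic"):
            raise ValueError(f"unknown edge kind: {edge.kind}")
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self._check_span(edge.span)
        self.edges.append(edge)

    def evidence(self, span: Span) -> str:
        """Return the source text that a node or edge is grounded in."""
        return self.document[span.start:span.end]


def span_of(document: str, phrase: str) -> Span:
    """Locate the first occurrence of a phrase; raises ValueError if absent."""
    start = document.index(phrase)
    return Span(start, start + len(phrase))


def build_example() -> GroundedKG:
    # Invented example sentence, used only to exercise the structure.
    doc = "Marie Curie discovered polonium. She later won a Nobel Prize."
    kg = GroundedKG(doc)
    kg.add_node(Node("n1", "entity", span_of(doc, "Marie Curie")))
    kg.add_node(Node("n2", "action", span_of(doc, "discovered")))
    kg.add_edge(Edge("n1", "n2", "semantic", "agent",
                     span_of(doc, "Marie Curie discovered polonium.")))
    return kg
```

Because each node and edge carries a span, a retrieved subgraph can be shown to a reader alongside the exact source passages via `evidence`, which is one way the interpretability claim could be realized in practice.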
