Research on Graph-Retrieval Augmented Generation Based on Historical Text Knowledge Graphs
Yang Fan, Zhang Qi, Xing Wenqian, Liu Chang, Liu Liu

TL;DR
This paper introduces Graph RAG, a framework that combines graph-based retrieval with generation to improve historical text analysis and knowledge extraction while reducing manual annotation costs in computational humanities.
Contribution
It proposes a Graph RAG framework that integrates knowledge graphs with retrieval-augmented generation for historical texts, supported by a new character relationship dataset and a collaborative mechanism between the knowledge graph and the retrieval-augmented generator.
Findings
Xunzi-Qwen1.5-14B, with Simplified Chinese input and chain-of-thought prompting, achieves F1 = 0.68 in relation extraction.
DeepSeek with GraphRAG improves F1 by 11%, surpassing baseline models.
The framework reduces manual annotation and enhances interpretability.
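For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for relation extraction when predictions and gold labels are both sets of (head, relation, tail) triples. Exact-match scoring, the function name `triple_f1`, and the placeholder triples are assumptions made for illustration; the summary does not state the paper's matching rule or evaluation code.

```python
def triple_f1(predicted, gold):
    """Exact-match F1 over (head, relation, tail) triples.

    Assumption: a predicted triple counts as correct only if all three
    fields match a gold triple exactly. The paper may score differently.
    """
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy placeholder triples, not taken from the paper's dataset.
gold = [
    ("Person_A", "father of", "Person_B"),
    ("Person_B", "minister of", "Person_C"),
    ("Person_C", "rival of", "Person_D"),
]
predicted = [
    ("Person_A", "father of", "Person_B"),
    ("Person_B", "minister of", "Person_C"),
    ("Person_C", "ally of", "Person_D"),  # wrong relation label
]
score = triple_f1(predicted, gold)
```

Here two of three predictions match, so precision and recall are both 2/3 and the F1 is 2/3.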
Abstract
This article addresses domain knowledge gaps in general large language models for historical text analysis in the context of computational humanities and AIGC technology. We propose the Graph RAG framework, combining chain-of-thought prompting, self-instruction generation, and process supervision to create a character relationship dataset for The First Four Histories with minimal manual annotation. This dataset supports automated historical knowledge extraction, reducing labor costs. In the graph-augmented generation phase, we introduce a collaborative mechanism between knowledge graphs and retrieval-augmented generation, improving the alignment of general models with historical knowledge. Experiments show that the domain-specific model Xunzi-Qwen1.5-14B, with Simplified Chinese input and chain-of-thought prompting, achieves optimal performance in relation extraction (F1 = 0.68). The…
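The general idea of pairing a knowledge graph with retrieval-augmented generation can be sketched as follows: look up the triples that mention entities in the question, then place them in the prompt ahead of the question. This is only an illustration under stated assumptions, not the authors' implementation. The function names, the substring-based entity matching, the prompt wording, and the placeholder triples are all hypothetical, and the call to a language model is left out.

```python
from collections import defaultdict


def build_index(triples):
    """Index each (head, relation, tail) triple under both of its entities."""
    index = defaultdict(list)
    for triple in triples:
        head, _, tail = triple
        index[head].append(triple)
        index[tail].append(triple)
    return index


def retrieve_facts(index, question, max_facts=5):
    """Return triples whose entities appear in the question.

    Assumption: plain substring matching stands in for real entity linking.
    """
    seen, facts = set(), []
    for entity, triples in index.items():
        if entity in question:
            for triple in triples:
                if triple not in seen:
                    seen.add(triple)
                    facts.append(triple)
    return facts[:max_facts]


def build_prompt(question, facts):
    """Prepend the retrieved graph facts to the question (hypothetical format)."""
    lines = [f"{h} -[{r}]-> {t}" for h, r, t in facts]
    context = "\n".join(lines) if lines else "(no graph facts found)"
    return f"Known relationships:\n{context}\n\nQuestion: {question}\nAnswer:"


# Toy placeholder triples, not taken from the paper's dataset.
triples = [
    ("Person_A", "ally of", "Person_B"),
    ("Person_B", "rival of", "Person_C"),
]
index = build_index(triples)
question = "How is Person_B related to others?"
facts = retrieve_facts(index, question)
prompt = build_prompt(question, facts)
```

The resulting `prompt` string would then be passed to whichever generator model is in use; grounding the answer in explicit triples is also what makes the output easier to inspect.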
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Semantic Web and Ontologies
