GPR: Empowering Generation with Graph-Pretrained Retriever
Xiaochen Wang, Zongyu Wu, Yuan Zhong, Xiang Zhang, Suhang Wang, Fenglong Ma

TL;DR
GPR is a graph-based retriever pretrained directly on knowledge graphs. By aligning natural language questions with relevant subgraphs, it improves retrieval accuracy and generation quality in graph retrieval-augmented generation (GRAG) tasks.
Contribution
We introduce GPR, a graph-pretrained retriever that learns directly from knowledge graphs. It addresses two limitations of retrievers built on text-pretrained language models, domain misalignment and structure ignorance, and improves both retrieval and generation performance.
Findings
GPR consistently improves retrieval quality relative to five baselines.
GPR improves downstream generation results.
GPR is effective across two datasets and three LLM backbones.
Abstract
Graph retrieval-augmented generation (GRAG) places high demands on graph-specific retrievers. However, existing retrievers often rely on language models pretrained on plain text, limiting their effectiveness due to domain misalignment and structure ignorance. To address these challenges, we propose GPR, a graph-based retriever pretrained directly on knowledge graphs. GPR aligns natural language questions with relevant subgraphs through LLM-guided graph augmentation and employs a structure-aware objective to learn fine-grained retrieval strategies. Experiments on two datasets, three LLM backbones, and five baselines show that GPR consistently improves both retrieval quality and downstream generation, demonstrating its effectiveness as a robust retrieval solution for GRAG.
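
To make the retrieval step in GRAG concrete, here is a minimal sketch of scoring knowledge-graph triples against a question embedding and keeping the top-k as the retrieved subgraph. It is not GPR's method: the cosine scoring, the function names (`cosine`, `retrieve_subgraph`), and the hand-made toy embeddings are all assumptions for illustration. A pretrained retriever such as GPR would learn these representations rather than have them written by hand.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_subgraph(question_vec, triple_vecs, k=2):
    """Return the k triples whose embeddings are most similar to the question."""
    ranked = sorted(
        triple_vecs.items(),
        key=lambda item: cosine(question_vec, item[1]),
        reverse=True,
    )
    return [triple for triple, _ in ranked[:k]]


# Toy, hand-made embeddings (hypothetical; a real retriever would learn them).
triples = {
    ("Paris", "capital_of", "France"): [0.9, 0.1, 0.0],
    ("France", "located_in", "Europe"): [0.6, 0.4, 0.1],
    ("Python", "created_by", "Guido van Rossum"): [0.0, 0.1, 0.9],
}

# Stands in for an embedding of "What is the capital of France?"
question = [1.0, 0.2, 0.0]

top = retrieve_subgraph(question, triples, k=2)
print(top)
```

In a GRAG pipeline, the retrieved triples would then be serialized into the LLM prompt as context for answer generation.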
Taxonomy
Topics: Geophysical Methods and Applications · Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques
