Pruning Minimal Reasoning Graphs for Efficient Retrieval-Augmented Generation

Ning Wang; Kuanyan Zhu; Daniel Yuehwoon Yee; Yitang Gao; Shiying Huang; Zirun Xu; Sainyam Galhotra

arXiv:2602.04926 · cs.DB · February 6, 2026

TL;DR

AutoPrunedRetriever is a graph-based RAG system that efficiently stores and updates minimal reasoning graphs for knowledge-intensive tasks, reducing token usage and latency while maintaining high accuracy.

Contribution

It introduces a novel graph pruning and extension method for RAG systems, enabling incremental reasoning with minimal storage and computation.

Findings

1. Achieves state-of-the-art accuracy on GraphRAG-Benchmark.
2. Reduces token usage by up to two orders of magnitude.
3. Maintains high reasoning performance on complex benchmarks.

Abstract

Retrieval-augmented generation (RAG) is now standard for knowledge-intensive LLM tasks, but most systems still treat every query as fresh, repeatedly re-retrieving long passages and re-reasoning from scratch, inflating tokens, latency, and cost. We present AutoPrunedRetriever, a graph-style RAG system that persists the minimal reasoning subgraph built for earlier questions and incrementally extends it for later ones. AutoPrunedRetriever stores entities and relations in a compact, ID-indexed codebook and represents questions, facts, and answers as edge sequences, enabling retrieval and prompting over symbolic structure instead of raw text. To keep the graph compact, we apply a two-layer consolidation policy (fast ANN/KNN alias detection plus selective $k$-means once a memory threshold is reached) and prune low-value structure, while prompts retain only overlap representatives and…
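
To make the "ID-indexed codebook" and "edge sequences" idea from the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the class name `Codebook`, its methods, the lowercase normalization, and the example triples are all assumptions made for illustration. It only shows how entity and relation strings could be interned as small integer IDs, so that questions and facts become sequences of (head, relation, tail) ID triples instead of raw text.

```python
from dataclasses import dataclass, field


@dataclass
class Codebook:
    """Hypothetical ID-indexed codebook (illustrative, not the paper's API)."""
    entity_ids: dict[str, int] = field(default_factory=dict)
    relation_ids: dict[str, int] = field(default_factory=dict)
    entities: list[str] = field(default_factory=list)
    relations: list[str] = field(default_factory=list)

    @staticmethod
    def _intern(name: str, ids: dict[str, int], names: list[str]) -> int:
        # Assumed normalization: trim and lowercase so trivial variants share an ID.
        key = name.strip().lower()
        if key not in ids:
            ids[key] = len(names)
            names.append(key)
        return ids[key]

    def encode(self, triples):
        """Turn (head, relation, tail) strings into a sequence of ID edges."""
        return [
            (
                self._intern(h, self.entity_ids, self.entities),
                self._intern(r, self.relation_ids, self.relations),
                self._intern(t, self.entity_ids, self.entities),
            )
            for h, r, t in triples
        ]

    def decode(self, edges):
        """Map ID edges back to their normalized strings."""
        return [(self.entities[h], self.relations[r], self.entities[t])
                for h, r, t in edges]


book = Codebook()
# Made-up example facts; a later question reuses IDs already in the codebook.
fact_edges = book.encode([
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
])
question_edges = book.encode([("marie curie", "born_in", "Warsaw")])
# Edges shared between the question and the stored facts.
overlap = [e for e in question_edges if e in fact_edges]
```

In this sketch the later question adds nothing new to the codebook, which is the kind of reuse the abstract describes. The alias detection, $k$-means consolidation, and pruning steps are not modeled here.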

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Multimodal Machine Learning Applications