SAGE: Structure Aware Graph Expansion for Retrieval of Heterogeneous Data
Prasham Titiya, Rohit Khoja, Tomer Wolfson, Vivek Gupta, Dan Roth

TL;DR
SAGE is a structure-aware graph expansion method for retrieval over heterogeneous data. It improves multi-modal evidence retrieval by combining offline graph construction with online neighbor expansion and filtering.
Contribution
The paper presents a framework that constructs a chunk-level graph offline and, at query time, expands the first-hop neighbors of the retrieved seed chunks and filters them, improving retrieval recall over existing methods.
Findings
SAGE improves retrieval recall by 5.7 points on OTT-QA.
SAGE improves retrieval recall by 8.5 points on STaRK.
The method effectively combines offline graph construction with online neighbor expansion.
Abstract
Retrieval-augmented question answering over heterogeneous corpora requires connected evidence across text, tables, and graph nodes. While entity-level knowledge graphs support structured access, they are costly to construct and maintain, and inefficient to traverse at query time. In contrast, standard retriever-reader pipelines use flat similarity search over independently chunked text, missing multi-hop evidence chains across modalities. We propose the SAGE (Structure Aware Graph Expansion) framework, which (i) constructs a chunk-level graph offline using metadata-driven similarities with percentile-based pruning, and (ii) performs online retrieval by running an initial baseline retriever to obtain k seed chunks, expanding their first-hop neighbors, and then filtering those neighbors using dense+sparse retrieval to select k' additional chunks. We instantiate the initial retriever using hybrid…
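The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Jaccard similarity over metadata token sets, the single global percentile threshold, and the caller-supplied `score` function (standing in for the dense+sparse filter) are all assumptions made for illustration, as are the names `jaccard`, `build_graph`, and `expand_and_filter`.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two metadata token sets (an assumed similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(metadata, percentile=80.0):
    """Offline step (sketch): link chunks whose metadata similarity is at or
    above the given percentile of all nonzero pairwise similarities."""
    scored = [
        (jaccard(metadata[u], metadata[v]), u, v)
        for u, v in combinations(sorted(metadata), 2)
    ]
    positives = sorted(s for s, _, _ in scored if s > 0)
    graph = {chunk_id: set() for chunk_id in metadata}
    if not positives:
        return graph
    # Nearest-rank percentile, computed globally (an assumption).
    idx = min(len(positives) - 1, int(len(positives) * percentile / 100))
    threshold = positives[idx]
    for s, u, v in scored:
        if s > 0 and s >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def expand_and_filter(seeds, graph, score, k_prime):
    """Online step (sketch): collect first-hop neighbors of the k seed chunks,
    rank them with `score`, and append the top k_prime to the seeds."""
    seed_set = set(seeds)
    candidates = {n for s in seeds for n in graph.get(s, ())} - seed_set
    # Ties are broken by chunk id so the output is deterministic.
    ranked = sorted(candidates, key=lambda c: (-score(c), c))
    return list(seeds) + ranked[:k_prime]
```

In this sketch, `seeds` would come from whatever baseline retriever is in use, and `score` would map a chunk id to its relevance to the query. Raising `percentile` yields a sparser graph and therefore fewer expansion candidates per seed.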
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Graph Theory and Algorithms
