TL;DR
The paper introduces $\Psi$-RAG, a hierarchical tree-based retrieval-augmented generation (RAG) framework that improves cross-document multi-hop question answering through adaptive indexing and multi-granular retrieval.
Contribution
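It proposes a hierarchical abstract tree index and a multi-granular retrieval agent to improve cross-document retrieval in RAG systems. Together they target three limitations of existing Tree-RAG methods: poor adaptability to data distributions, structural isolation between documents, and coarse abstraction that hides fine-grained details.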
Findings
Outperforms RAPTOR by 25.9% in F1 score on cross-document QA benchmarks.
Outperforms HippoRAG 2 by 7.4% in F1 score.
Supports diverse tasks from token-level QA to document summarization.
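The F1 figures above are answer-quality scores. As a minimal sketch, assuming the benchmarks use the token-overlap F1 that is common in QA evaluation (the summary does not state the exact variant, and the function name `token_f1` is hypothetical), the metric can be computed as follows:

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: lowercase + whitespace tokenization. Real benchmark
    scripts often also strip punctuation and articles.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_f1("the cat sat", "the cat"))
```

Here the prediction shares two of its three tokens with a two-token reference, so precision is 2/3 and recall is 1.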
Abstract
Retrieval-augmented generation (RAG) enhances large language models with external knowledge, and tree-based RAG organizes documents into hierarchical indexes to support queries at multiple granularities. However, existing Tree-RAG methods designed for single-document retrieval face critical challenges in scaling to cross-document multi-hop questions: (1) poor distribution adaptability, where $k$-means clustering introduces noise due to rigid distribution assumptions; (2) structural isolation, as tree indexes lack explicit cross-document connections; and (3) coarse abstraction, which obscures fine-grained details. To address these limitations, we propose $\Psi$-RAG, a Tree-RAG framework with two key components. First, a hierarchical abstract tree index built through an iterative "merging and collapse" process that adapts to data distributions without a priori assumption. Second, a…
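To illustrate how a tree index can be built without fixing a cluster count in advance, the sketch below uses plain threshold-based agglomerative merging. It is not the paper's "merging and collapse" algorithm, whose details are not given in this excerpt (the collapse step in particular is omitted). The `Node` class, the `build_tree` function, the averaged parent vector, and the `threshold` value are all assumptions made for illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Node:
    vector: list[float]            # embedding (toy 2-D vectors here)
    text: str                      # chunk text, or a stand-in for a summary
    children: list[Node] = field(default_factory=list)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_tree(leaves: list[Node], threshold: float = 0.5) -> list[Node]:
    """Repeatedly merge the most similar pair of roots under a new parent.

    Stops when no pair is at least `threshold` similar, so the number of
    resulting trees follows from the data rather than from a preset k.
    """
    roots = list(leaves)
    while len(roots) > 1:
        best = None  # (similarity, i, j) of the closest pair so far
        for i in range(len(roots)):
            for j in range(i + 1, len(roots)):
                s = cosine(roots[i].vector, roots[j].vector)
                if best is None or s > best[0]:
                    best = (s, i, j)
        s, i, j = best
        if s < threshold:
            break
        a, b = roots[i], roots[j]
        # Assumption: the parent embedding is the mean of its children; a
        # real system would more likely embed an LLM-written summary.
        parent = Node(
            vector=[(x + y) / 2 for x, y in zip(a.vector, b.vector)],
            text=f"({a.text} + {b.text})",
            children=[a, b],
        )
        roots = [n for k, n in enumerate(roots) if k not in (i, j)] + [parent]
    return roots


if __name__ == "__main__":
    leaves = [
        Node([1.0, 0.0], "a"),
        Node([0.9, 0.1], "b"),
        Node([0.0, 1.0], "c"),
    ]
    for root in build_tree(leaves):
        print(root.text, len(root.children))
```

With the toy vectors above, the two near-parallel leaves are merged under one parent while the orthogonal leaf stays a separate root, which is the kind of data-driven grouping a fixed $k$ cannot express.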
