TL;DR
This paper introduces Semantic Pyramid Indexing (SPI), a multi-resolution vector indexing framework that adaptively balances retrieval speed and relevance in RAG systems using VecDBs.
Contribution
SPI is a novel, query-adaptive multi-resolution indexing method that improves retrieval efficiency and relevance without offline tuning or separate training.
Findings
SPI achieves up to 5.7× speedup in retrieval
SPI improves memory efficiency by 1.8×
End-to-end QA F1 scores increase by up to 2.5 points
Abstract
Retrieval-Augmented Generation (RAG) systems have become a dominant approach to augment large language models (LLMs) with external knowledge. However, existing vector database (VecDB) retrieval pipelines rely on flat or single-resolution indexing structures, which cannot adapt to the varying semantic granularity required by diverse user queries. This limitation leads to suboptimal trade-offs between retrieval speed and contextual relevance. To address this, we propose Semantic Pyramid Indexing (SPI), a novel multi-resolution vector indexing framework that introduces query-adaptive resolution control for RAG in VecDBs. Unlike existing hierarchical methods that require offline tuning or separate model training, SPI constructs a semantic pyramid over document embeddings and dynamically selects the optimal resolution level per query through a lightweight classifier. This adaptive…
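The abstract's two ideas, a pyramid of resolutions over document embeddings and a per-query choice of level, can be illustrated with a toy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: coarser levels are produced by averaging adjacent embedding dimensions (pool_pairs), and select_level is a hypothetical word-count rule standing in for the paper's lightweight classifier, whose actual features and training are not described on this page.

```python
import math

def pool_pairs(vec):
    """Toy coarsening step (assumption): halve dimensionality by averaging adjacent pairs."""
    return [(vec[i] + vec[i + 1]) / 2.0 for i in range(0, len(vec) - 1, 2)]

def build_pyramid(embeddings, levels=3):
    """Return a list of levels; level 0 is the finest (original) resolution."""
    pyramid = [[list(v) for v in embeddings]]
    for _ in range(levels - 1):
        pyramid.append([pool_pairs(v) for v in pyramid[-1]])
    return pyramid

def project(vec, level):
    """Bring a full-resolution query vector down to the given pyramid level."""
    for _ in range(level):
        vec = pool_pairs(vec)
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def select_level(query_text, levels):
    """Hypothetical stand-in for the lightweight classifier:
    longer queries are routed to finer levels, short ones to the coarsest."""
    n_words = len(query_text.split())
    if n_words >= 8:
        return 0
    if n_words >= 4:
        return min(1, levels - 1)
    return levels - 1

def search(pyramid, query_vec, query_text, k=2):
    """Pick a level for this query, then rank documents by cosine similarity at that level."""
    level = select_level(query_text, len(pyramid))
    q = project(query_vec, level)
    scored = sorted(((cosine(q, d), i) for i, d in enumerate(pyramid[level])), reverse=True)
    return level, [i for _, i in scored[:k]]

# Usage with made-up 8-dimensional embeddings.
docs = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0, 0, 0],
]
pyramid = build_pyramid(docs, levels=3)
query_vec = [1, 0, 0, 0, 0, 0, 0, 0]
fine_level, fine_hits = search(pyramid, query_vec, "one two three four five six seven eight", k=1)
coarse_level, coarse_hits = search(pyramid, query_vec, "cats", k=1)
```

In this toy setup the coarsest level compares 2-dimensional vectors instead of 8-dimensional ones, which is cheaper but can no longer tell the first and third documents apart. That is the speed versus relevance trade-off the abstract says a per-query level choice is meant to manage.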
