Progressive Searching for Retrieval in RAG
Taehee Jeong, Xingzhe Zhao, Peizu Li, Markus Valvur, Weihua Zhao

TL;DR
This paper introduces a progressive searching algorithm for Retrieval Augmented Generation (RAG) systems that reduces retrieval time while preserving accuracy by incrementally refining the candidate set through a hierarchy of searches, from low-dimensional embeddings up to the target dimensionality.
Contribution
The paper presents a novel multi-stage search method that reduces retrieval time while maintaining high accuracy in RAG systems, enhancing scalability for large databases.
Findings
Achieves faster retrieval with maintained accuracy
Balances dimensionality, speed, and accuracy effectively
Enables scalable high-performance retrieval in RAG
Abstract
Retrieval Augmented Generation (RAG) is a promising technique for mitigating two key limitations of large language models (LLMs): outdated information and hallucinations. A RAG system stores documents as embedding vectors in a database. Given a query, a search is executed to find the most closely related documents. The top-matching documents are then inserted into the LLM's prompt to generate a response. Efficient and accurate searching is critical for RAG to retrieve relevant information. We propose a cost-effective searching algorithm for the retrieval process. Our progressive searching algorithm incrementally refines the candidate set through a hierarchy of searches, starting from low-dimensional embeddings and progressing to the higher target dimensionality. This multi-stage approach reduces retrieval time while preserving the desired accuracy. Our findings demonstrate that progressive search in RAG…
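The multi-stage refinement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a low-dimensional embedding can be obtained by truncating the full vector to its first d components (as with nested, Matryoshka-style embeddings), which the abstract does not confirm. The function name `progressive_search` and the parameters `dims` and `keep` are hypothetical, chosen only for illustration.

```python
import numpy as np


def normalize(x):
    """Scale vectors to unit length along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def progressive_search(query, docs, dims, keep):
    """Refine a candidate set stage by stage (illustrative sketch).

    query: 1-D array holding the full-dimensional query embedding.
    docs:  2-D array (n_docs, full_dim) of document embeddings.
    dims:  increasing embedding sizes, one per stage (hypothetical parameter).
    keep:  number of candidates kept after each stage (hypothetical parameter).

    Assumption: the first d components of a vector act as its
    d-dimensional embedding. The source does not state this.
    """
    candidates = np.arange(docs.shape[0])
    for d, k in zip(dims, keep):
        q = normalize(query[:d])
        sub = normalize(docs[candidates, :d])
        scores = sub @ q  # cosine similarity at dimension d
        k = min(k, candidates.size)
        top = np.argsort(-scores)[:k]  # best k, highest score first
        candidates = candidates[top]
    return candidates


# Toy data: 1000 random 128-dimensional "documents".
rng = np.random.default_rng(0)
docs = rng.standard_normal((1000, 128))

# Query with an exact copy of document 42.
result = progressive_search(docs[42], docs, dims=(16, 64, 128), keep=(100, 20, 5))
print(result)
```

Only the first stage scores the whole database, and it does so at the smallest dimension. Later stages use larger vectors but touch only the surviving candidates, which is where the time saving in a scheme of this kind would come from. The stage sizes trade speed against the risk of discarding a relevant document early.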
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Multimodal Machine Learning Applications
