AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval
Kento Tatsuno, Daisuke Miyashita, Taiga Ikeda, Kiyoshi Ishiyama, Kazunari Sumiyoshi, Jun Deguchi

TL;DR
AiSAQ introduces a storage-efficient ANNS method that offloads compressed vectors to SSDs, enabling billion-scale retrieval with minimal memory and fast index switching, suitable for large-scale retrieval systems.
Contribution
The paper proposes AiSAQ, a novel approach that offloads product-quantized vectors to SSDs, significantly reducing memory usage and improving index switching speed for billion-scale datasets.
Findings
Achieves ~10 MB memory usage during query search on billion-scale datasets.
Avoids critical latency degradation in query search despite offloading compressed vectors to SSD.
Enables fast index switching for large-scale retrieval systems.
Abstract
Graph-based approximate nearest neighbor search (ANNS) algorithms work effectively for large-scale vector retrieval. Among such methods, DiskANN achieves good recall-speed tradeoffs using both DRAM and storage. DiskANN adopts product quantization (PQ) to reduce memory usage, which is still proportional to the scale of datasets. In this paper, we propose All-in-Storage ANNS with Product Quantization (AiSAQ), which offloads compressed vectors to the SSD index. Our method achieves 10 MB memory usage in query search with billion-scale datasets without critical latency degradation. AiSAQ also reduces the index load time for query search preparation, which enables fast switching between multiple billion-scale indices. This method can be applied to retrievers of retrieval-augmented generation (RAG) and be scaled out with multiple-server systems for emerging datasets. Our DiskANN-based…
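The abstract's point that PQ memory usage "is still proportional to the scale of datasets" can be illustrated with a little arithmetic. The sketch below is only an illustration, not the paper's code: the function name `pq_code_bytes` and the settings of 32 subspaces and 8 bits per subspace are assumptions chosen for the example, not values taken from the paper.

```python
def pq_code_bytes(num_vectors: int, num_subspaces: int, bits_per_subspace: int = 8) -> int:
    """Bytes needed to keep one PQ code per vector resident in memory.

    Each vector is represented by `num_subspaces` centroid indices,
    each index taking `bits_per_subspace` bits (illustrative parameters).
    """
    total_bits = num_vectors * num_subspaces * bits_per_subspace
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # Assumed setting: 32 subspaces x 8 bits = 32 bytes per vector.
    for n in (10**6, 10**8, 10**9):
        gb = pq_code_bytes(n, num_subspaces=32) / 10**9
        print(f"{n:>13,} vectors -> {gb:.3f} GB of PQ codes")
```

Under these assumed settings, one billion vectors need 32 GB just for the PQ codes, and the figure grows linearly with the number of vectors. This is the DRAM cost that keeping the compressed vectors in the SSD index is meant to avoid.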
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Data Storage Technologies · Caching and Content Delivery
Methods: Non Maximum Suppression · 1x1 Convolution · Convolution · SSD
