Low-Latency Out-of-Core ANN Search in High-Dimensional Space
Ziwen Song, Bin Wang, Xiaochun Yang, Junhua Zhang

TL;DR
SkipDisk is a hybrid disk-memory approach for high-dimensional approximate nearest neighbor search that significantly reduces memory usage while maintaining or improving low-latency search performance.
Contribution
The paper introduces SkipDisk, a novel method combining tight lower bound filtering, data pruning, and asynchronous I/O to optimize disk-memory hybrid ANN search.
Findings
Achieves 85% of HNSW's latency with 10% of the memory footprint.
Reduces search latency to 63% of HNSW's latency.
Maintains competitive accuracy with significantly less memory.
Abstract
In-memory graph-based approximate nearest neighbor (ANN) search has superior search performance but incurs a significant memory footprint. Disk-based methods reduce memory usage but suffer from high disk access latency. A common challenge is how to achieve low-latency search while significantly reducing the memory footprint. In this paper, we propose SkipDisk, a disk-memory hybrid ANN search method that significantly reduces the memory footprint while achieving search latency comparable to or lower than that of the in-memory method HNSW. By analyzing existing disk-based methods, we observed that disk access remains the primary bottleneck, and that existing lower-bound-based filtering methods are too loose to effectively reduce disk access. Therefore, we design SkipDisk to achieve a tight lower bound with a low memory footprint to reduce the search latency. First, we design a dedicated pivot for each point to improve the…
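To make the idea of lower-bound filtering concrete, here is a minimal sketch of the generic pivot-based technique, not SkipDisk's actual design (the abstract is cut off before that is described). It assumes a single shared pivot, Euclidean distance, and a fixed distance threshold; the names `filtered_search`, `read_from_disk`, and `in_memory_pivot_dists` are hypothetical. Only the point-to-pivot distances are kept in memory, and the triangle inequality |d(q, p) - d(x, p)| <= d(q, x) is used to skip disk reads for points that cannot be within the threshold.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lower_bound(dist_query_pivot, dist_point_pivot):
    """Triangle-inequality lower bound on d(query, point)."""
    return abs(dist_query_pivot - dist_point_pivot)


def filtered_search(query, pivot, in_memory_pivot_dists, read_from_disk, threshold):
    """Return (matches, disk_reads) for points within `threshold` of `query`.

    in_memory_pivot_dists[i] is the precomputed d(point_i, pivot).
    read_from_disk(i) is a hypothetical callback that fetches the full vector i.
    """
    dist_query_pivot = euclidean(query, pivot)
    matches = []
    disk_reads = 0
    for idx, dist_point_pivot in enumerate(in_memory_pivot_dists):
        # If even the lower bound exceeds the threshold, the true distance
        # must too, so the full vector never needs to be read.
        if lower_bound(dist_query_pivot, dist_point_pivot) > threshold:
            continue
        vector = read_from_disk(idx)
        disk_reads += 1
        dist = euclidean(query, vector)
        if dist <= threshold:
            matches.append((idx, dist))
    return matches, disk_reads


# Toy data standing in for vectors stored on disk (illustrative only).
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
pivot = (0.0, 0.0)
pivot_dists = [euclidean(p, pivot) for p in points]

matches, disk_reads = filtered_search(
    query=(0.5, 0.0),
    pivot=pivot,
    in_memory_pivot_dists=pivot_dists,
    read_from_disk=lambda i: points[i],
    threshold=1.0,
)
print(matches, disk_reads)
```

In this toy run the two far-away points are rejected from their in-memory pivot distances alone, so only two of the four vectors are fetched. The tighter the lower bound, the more reads can be skipped, which is the property the abstract says existing filters lack.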
