AlayaLaser: Efficient Index Layout and Search Strategy for Large-scale High-dimensional Vector Similarity Search
Weijian Chen, Haotian Liu, Yangshen Deng, Long Xiang, Liang Huang, Gezi Li, Bo Tang

TL;DR
AlayaLaser is a novel on-disk graph-based index system that optimizes compute-bound operations in high-dimensional vector similarity search, significantly improving performance over existing methods.
Contribution
The paper introduces AlayaLaser, a new on-disk index system with a novel data layout and optimization techniques that address compute-bound bottlenecks in high-dimensional ANNS.
Findings
AlayaLaser outperforms existing on-disk graph-based index systems.
AlayaLaser matches or exceeds the performance of in-memory index systems.
Performance improvements are validated on large-scale high-dimensional datasets.
Abstract
On-disk graph-based approximate nearest neighbor search (ANNS) is essential for large-scale, high-dimensional vector retrieval, yet its performance is widely considered to be limited by prohibitive I/O costs. Interestingly, we observe that as vector dimensionality rises (e.g., to hundreds or thousands of dimensions), the performance of on-disk graph-based index systems becomes compute-bound rather than I/O-bound. This insight uncovers a significant optimization opportunity: existing on-disk graph-based index systems universally target I/O reduction and largely overlook computational overhead, leaving substantial room for performance improvement. In this work, we propose AlayaLaser, an efficient on-disk graph-based index system for large-scale high-dimensional vector similarity search. In particular, we first conduct performance analysis on existing on-disk graph-based index systems via…
