DGAI: Decoupled On-Disk Graph-Based ANN Index for Efficient Updates and Queries
Jiahao Lou, Shufeng Gong, Quan Yu, Hao Guo, Youyou Lu, Song Yu, Yanfeng Zhang, Tiezheng Nie, Ge Yu

TL;DR
DGAI is a decoupled on-disk graph-based index for billion-scale ANNS that stores vectors separately from the graph topology to speed up updates, and recovers query performance with a similarity-aware dynamic layout and hierarchical PQ techniques.
Contribution
It proposes a novel decoupled storage architecture with co-designed query techniques, enabling efficient updates and low-latency queries in large-scale ANNS systems.
Findings
8.17x faster insertions and 8.16x faster deletions
67% reduction in peak query latency
Improved resource efficiency in large-scale ANNS
Abstract
On-disk graph-based indexes are favored for billion-scale Approximate Nearest Neighbor Search (ANNS) due to their high performance and cost-efficiency. However, existing systems typically rely on a coupled storage architecture that co-locates vectors and graph topology, which introduces substantial redundant I/O during index updates, thereby degrading usability in dynamic workloads. In this paper, we propose a decoupled storage architecture that physically separates heavy vectors from the lightweight graph topology. This design substantially improves update performance by reducing redundant I/O during updates. However, it introduces I/O amplification during ANNS, leading to degraded query efficiency. To improve query performance within the update-friendly architecture, we propose two techniques co-designed with the decoupled storage. We develop a similarity-aware dynamic layout that…
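The trade-off described in the abstract can be illustrated with a toy page-count model. This is only a sketch under stated assumptions and does not reproduce DGAI's actual on-disk format: the 4 KiB page size, 128-dimensional float32 vectors, maximum degree of 32, and the helper names `pages_touched` and `reads_per_hop` are all hypothetical choices made for illustration.

```python
# Toy model (assumptions only, not DGAI's real layout): count the disk pages
# touched when neighbor lists change, for coupled vs. decoupled storage.

PAGE_BYTES = 4096                 # assumed page size
VECTOR_BYTES = 128 * 4            # assumed 128-dim float32 vector
NEIGHBOR_BYTES = 32 * 4           # assumed max degree 32, 4-byte neighbor ids

COUPLED_RECORD = VECTOR_BYTES + NEIGHBOR_BYTES   # vector and neighbors co-located
TOPOLOGY_RECORD = NEIGHBOR_BYTES                 # neighbors only (decoupled)


def pages_touched(node_ids, record_bytes, page_bytes=PAGE_BYTES):
    """Number of distinct pages holding the records of node_ids.

    Records are assumed to be fixed-size, stored in node-id order,
    and never split across pages.
    """
    per_page = max(1, page_bytes // record_bytes)
    return len({node_id // per_page for node_id in node_ids})


def reads_per_hop(decoupled):
    """Page reads to fetch one node's vector and neighbor list during search.

    A coupled record yields both in one read; a decoupled layout needs one
    read from the topology file and one from the vector file (the I/O
    amplification the abstract mentions), before any layout optimization.
    """
    return 2 if decoupled else 1


# Suppose an insertion rewires the neighbor lists of these five nodes.
updated = [0, 7, 14, 21, 28]
coupled_pages = pages_touched(updated, COUPLED_RECORD)
decoupled_pages = pages_touched(updated, TOPOLOGY_RECORD)
print(coupled_pages, decoupled_pages)
```

In this model the coupled layout fits 6 records per page and the topology-only layout fits 32, so the same neighbor-list rewrite touches 5 pages versus 1. The query side shows the opposite effect: each hop costs two reads instead of one, which is the amplification the co-designed query techniques aim to reduce.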
