d-HNSW: A High-performance Vector Search Engine on Disaggregated Memory
Fei Fang, Yi Liu, Chen Qian

TL;DR
d-HNSW is a novel RDMA-based vector search engine optimized for disaggregated memory systems, overcoming unique system challenges to deliver high accuracy, low latency, and high throughput for large-scale AI applications.
Contribution
It introduces hardware-algorithm co-designed techniques specifically for disaggregated memory architectures, enabling efficient vector search with improved performance and resource utilization.
Findings
Achieves up to 100x query throughput
Reduces query latency by over 10x
Maintains 94% recall in large-scale tests
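The recall figure above can be read against the usual recall@k definition: the fraction of a query's true k nearest neighbors that the search actually returns. The sketch below assumes that definition. This summary does not state which k or dataset the 94% refers to, and the IDs in the example are hypothetical.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbors that appear in the retrieved top-k."""
    true_top = set(ground_truth[:k])
    found = set(retrieved[:k]) & true_top
    return len(found) / k


# Hypothetical neighbor IDs for one query: 4 of the 5 true neighbors were found.
example = recall_at_k([7, 2, 9, 4, 1], [2, 7, 4, 9, 3], k=5)
print(example)  # 0.8
```

In practice the value is averaged over all queries in the test set.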
Abstract
Efficient vector search is essential for powering large-scale AI applications, such as LLMs. Existing solutions are designed for monolithic architectures where compute and memory are tightly coupled. Recently, disaggregated architectures have broken this coupling by separating computation and memory resources into independently scalable pools to improve utilization. However, running a vector database on a disaggregated memory system brings unique system-design challenges because of its graph-based index. We present d-HNSW, the first RDMA-based vector search engine optimized for disaggregated memory systems. d-HNSW preserves HNSW's high accuracy while addressing the new system-level challenges introduced by disaggregation: 1) network inefficiency from pointer-chasing traversals, 2) non-contiguous remote memory layout induced by dynamic insertions, 3) redundant data transfers in batch workloads, and…
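To make challenges 1) and 3) concrete, here is a minimal sketch of a greedy walk over a single graph layer in which every node lookup is a separate remote read. It is not d-HNSW's design. `RemoteMemory`, `greedy_search`, the toy line graph, and the shared `cache` are all hypothetical names invented for illustration, and a Python dictionary stands in for the RDMA-accessed memory pool.

```python
import math


class RemoteMemory:
    """Stand-in for a remote memory pool; each read() models one network round trip."""

    def __init__(self, nodes):
        self.nodes = nodes        # node_id -> (vector, neighbor_ids)
        self.round_trips = 0

    def read(self, node_id):
        self.round_trips += 1
        return self.nodes[node_id]


def greedy_search(remote, query, entry, cache=None):
    """Greedy walk: move to the closest neighbor until none is closer to the query."""

    def fetch(node_id):
        # With a cache, a node already transferred is not fetched again.
        if cache is not None and node_id in cache:
            return cache[node_id]
        record = remote.read(node_id)
        if cache is not None:
            cache[node_id] = record
        return record

    current = entry
    vector, neighbors = fetch(current)
    best = math.dist(query, vector)
    while True:
        next_node = None
        for nb in neighbors:
            # Pointer chasing: the neighbor's address is only known after
            # its parent's record has arrived, so reads are dependent.
            nb_vector, nb_neighbors = fetch(nb)
            d = math.dist(query, nb_vector)
            if d < best:
                best, next_node, next_neighbors = d, nb, nb_neighbors
        if next_node is None:
            return current
        current, neighbors = next_node, next_neighbors


# Toy data (assumption): five 1-D points on a line, each linked to its adjacent points.
nodes = {i: ((float(i),), [j for j in (i - 1, i + 1) if 0 <= j <= 4])
         for i in range(5)}

uncached = RemoteMemory(nodes)
greedy_search(uncached, (3.9,), entry=0)
greedy_search(uncached, (3.1,), entry=0)

batched = RemoteMemory(nodes)
shared_cache = {}
greedy_search(batched, (3.9,), entry=0, cache=shared_cache)
greedy_search(batched, (3.1,), entry=0, cache=shared_cache)

print(uncached.round_trips, batched.round_trips)
```

On this toy graph the two uncached queries re-read the same nodes repeatedly, while the batch sharing one cache transfers each node at most once. Real systems face the same effect at far larger scale, which is why dependent reads and duplicate transfers are listed as design challenges.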
Taxonomy
Topics: Cloud Computing and Resource Management · Graph Theory and Algorithms · Advanced Database Systems and Queries
