Fast Graph Vector Search via Hardware Acceleration and Delayed-Synchronization Traversal
Wenqi Jiang, Hang Hu, Torsten Hoefler, Gustavo Alonso

TL;DR
This paper introduces Falcon, a hardware accelerator, and Delayed-Synchronization Traversal (DST), a graph traversal algorithm. Together they reduce latency and improve energy efficiency in the graph-based vector search (GVS) systems used in LLM serving, search engines, and recommender systems.
Contribution
The paper co-designs hardware and algorithm: it pairs the Falcon accelerator, prototyped on FPGAs, with the DST traversal algorithm to lower GVS latency and improve energy efficiency.
Findings
Up to 4.3x lower latency compared to CPU systems.
Up to 8.0x energy efficiency improvement over GPU systems.
Results are measured on a Falcon prototype implemented on FPGAs.
Abstract
Vector search systems are indispensable in large language model (LLM) serving, search engines, and recommender systems, where minimizing online search latency is essential. Among various algorithms, graph-based vector search (GVS) is particularly popular due to its high search performance and quality. However, reducing GVS latency by intra-query parallelization remains challenging due to limitations imposed by both existing hardware architectures (CPUs and GPUs) and the inherent difficulty of parallelizing graph traversals. To efficiently serve low-latency GVS, we co-design hardware and algorithm by proposing Falcon and Delayed-Synchronization Traversal (DST). Falcon is a hardware GVS accelerator that implements efficient GVS operators, pipelines these operators, and reduces memory accesses by tracking search states with an on-chip Bloom filter. DST is an efficient graph traversal…
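To make the Bloom-filter idea in the abstract concrete, here is a minimal software sketch of a best-first graph search that tracks visited nodes in a Bloom filter instead of an exact set. It is an illustration only, not Falcon's hardware design: the names `BloomFilter`, `graph_search`, and `sq_dist`, the filter size, the hash construction, and the `ef` candidate-list parameter are all assumptions introduced here.

```python
import hashlib
import heapq


class BloomFilter:
    """Fixed-size approximate set: no false negatives, rare false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(f"{seed}:{key}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, key):
        return all(self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(key))


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def graph_search(graph, vectors, query, entry, k=1, ef=8):
    """Best-first search over a proximity graph; returns up to k node ids."""
    visited = BloomFilter()
    visited.add(entry)
    d0 = sq_dist(vectors[entry], query)
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    results = [(-d0, entry)]     # max-heap (negated distance), at most ef entries
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest candidate is worse than the worst kept result.
        if len(results) >= ef and d > -results[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:    # a false positive here skips an unvisited node
                continue
            visited.add(nb)
            dn = sq_dist(vectors[nb], query)
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    ranked = sorted((-neg_d, n) for neg_d, n in results)
    return [n for _, n in ranked[:k]]
```

The filter uses a fixed amount of memory regardless of how many nodes are visited, which is what makes it a natural fit for on-chip storage. The cost is that a false positive can cause an unvisited node to be skipped, so the search trades a small risk of missing a neighbor for a compact visited set.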
Taxonomy
Topics: Data Management and Algorithms · Algorithms and Data Compression · Graph Theory and Algorithms
