GPU-Accelerated ANNS: Quantized for Speed, Built for Change
Hunter McCoy, Zikun Wang, Prashant Pandey

TL;DR
Jasper is a GPU-native ANNS system that significantly improves query throughput, supports fast dataset updates, and reduces memory usage through novel algorithms and quantization techniques, addressing key limitations of existing GPU-based solutions.
Contribution
Jasper introduces a lock-free batch construction algorithm, an efficient quantization method, and an optimized search kernel, enabling high-performance, updatable, and memory-efficient GPU-accelerated ANNS.
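To make the memory-saving idea concrete, here is a minimal sketch of generic per-dimension scalar quantization, which stores each float coordinate as one byte. This is an illustration only: the summary does not describe Jasper's quantization method, and the function names (`fit_scalar_quantizer`, `encode`, `decode`) are hypothetical.

```python
# Generic 8-bit scalar quantization sketch (NOT the paper's method; names are hypothetical).

def fit_scalar_quantizer(vectors):
    """Return the per-dimension minimum and step size for 8-bit codes."""
    dims = len(vectors[0])
    lo = [min(v[d] for v in vectors) for d in range(dims)]
    hi = [max(v[d] for v in vectors) for d in range(dims)]
    # A constant dimension has zero range; fall back to a step of 1.0
    # so that encoding never divides by zero.
    step = [(hi[d] - lo[d]) / 255 or 1.0 for d in range(dims)]
    return lo, step


def encode(vector, lo, step):
    """Map each coordinate to the nearest of 256 levels, clamped to [0, 255]."""
    return bytes(
        min(255, max(0, round((x - l) / s)))
        for x, l, s in zip(vector, lo, step)
    )


def decode(code, lo, step):
    """Reconstruct an approximate float vector from its 8-bit code."""
    return [l + c * s for c, l, s in zip(code, lo, step)]


data = [[0.0, 10.0], [1.0, 20.0], [0.5, 15.0]]
lo, step = fit_scalar_quantizer(data)
codes = [encode(v, lo, step) for v in data]
approx = [decode(c, lo, step) for c in codes]
```

Each coordinate occupies one byte instead of the four bytes of a 32-bit float, and for in-range values the reconstruction error per coordinate is at most half a step.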
Findings
Jasper achieves up to 1.93x higher query throughput than CAGRA.
Jasper constructs indices 2.4x faster than CAGRA.
Jasper delivers 19-131x faster queries compared to BANG.
Abstract
Approximate nearest neighbor search (ANNS) is a core problem in machine learning and information retrieval applications. GPUs offer a promising path to high-performance ANNS: they provide massive parallelism for distance computations, are readily available, and can co-locate with downstream applications. Despite these advantages, current GPU-accelerated ANNS systems face three key limitations. First, real-world applications operate on evolving datasets that require fast batch updates, yet most GPU indices must be rebuilt from scratch when new data arrives. Second, high-dimensional vectors strain memory bandwidth, but current GPU systems lack efficient quantization techniques that reduce data movement without introducing costly random memory accesses. Third, the data-dependent memory accesses inherent to greedy search make overlapping compute and memory difficult, leading to reduced…
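The data-dependent access pattern mentioned in the abstract can be seen in a minimal sketch of generic best-first (greedy) search over a proximity graph. This is not Jasper's search kernel; the graph layout, the `beam_width` parameter, and the function names are assumptions chosen for illustration.

```python
# Generic greedy graph search sketch (NOT the paper's kernel; names are hypothetical).
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_search(graph, vectors, query, entry, k, beam_width):
    """Return the ids of the k closest nodes found by best-first search."""
    visited = {entry}
    d0 = squared_l2(vectors[entry], query)
    frontier = [(d0, entry)]   # min-heap: closest unexpanded node first
    best = [(-d0, entry)]      # max-heap (negated): current beam of candidates
    while frontier:
        dist, node = heapq.heappop(frontier)
        # Stop once the closest unexpanded node is worse than a full beam.
        if len(best) == beam_width and dist > -best[0][0]:
            break
        # Data-dependent step: which adjacency list and vectors are read next
        # is only known after the previous distances have been computed.
        for nbr in graph[node]:
            if nbr in visited:
                continue
            visited.add(nbr)
            d = squared_l2(vectors[nbr], query)
            if len(best) < beam_width or d < -best[0][0]:
                heapq.heappush(frontier, (d, nbr))
                heapq.heappush(best, (-d, nbr))
                if len(best) > beam_width:
                    heapq.heappop(best)  # evict the farthest candidate
    ranked = sorted((-neg_d, n) for neg_d, n in best)
    return [n for _, n in ranked[:k]]


# Five points on a line, connected as a chain.
vectors = [[0.0], [1.0], [2.0], [3.0], [4.0]]
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = greedy_search(graph, vectors, [3.2], entry=0, k=2, beam_width=3)
```

Because each hop must finish its distance computations before the next node to load is known, memory fetches cannot be issued far ahead of time, which is the difficulty in overlapping compute and memory that the abstract points to.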
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Graph Theory and Algorithms
