Catapults to the Rescue: Accelerating Vector Search by Exploiting Query Locality
Sami Abuzakuk, Anne-Marie Kermarrec, Rafael Pires, Mathis Randl, Martijn de Vos

TL;DR
CatapultDB dynamically exploits query locality in vector search by adding shortcut edges, significantly improving throughput and efficiency without altering existing algorithms or features.
Contribution
The paper introduces a lightweight mechanism that adaptively chooses search starting points in graph-based ANN indices based on observed query patterns.
Findings
Up to 2.51x throughput improvement over DiskANN.
Maintains high recall and efficiency comparable to LSH-based methods.
Adapts effectively to workload shifts, outperforming cache-based solutions.
Abstract
Graph-based indexing is the dominant approach for approximate nearest neighbor search in vector databases, offering high recall with low latency across billions of vectors. However, in such indices, the edge set of the proximity graph is only modified to reflect changes in the indexed data, never to adapt to the query workload. This is wasteful: real-world query streams exhibit strong spatial and temporal locality, yet every query must re-traverse the same intermediate hops from fixed or random entry points. We present CatapultDB, a lightweight mechanism that, for the first time, dynamically determines where to begin the search in an ANN index on the fly, therefore exploiting query locality. CatapultDB injects shortcut edges called catapults that connect query regions to frequently visited destination nodes. Catapults are maintained as an additional layer on top of the graph, so the…
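The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class name `CatapultLayer`, the use of random-hyperplane hashing to define "query regions", and the rule "start from the most frequent past destination" are all assumptions made for illustration. The sketch only shows how a side layer can pick an entry point while the underlying graph and greedy search stay unchanged.

```python
import math
import random
from collections import Counter, defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_search(vectors, neighbors, query, start):
    """Plain greedy graph walk: hop to the closest neighbor until none improves.

    Returns (final_node, number_of_hops). The graph itself is never modified.
    """
    current, hops = start, 0
    while True:
        best = min(neighbors[current],
                   key=lambda n: dist(vectors[n], query),
                   default=current)
        if dist(vectors[best], query) >= dist(vectors[current], query):
            return current, hops
        current, hops = best, hops + 1


class CatapultLayer:
    """Hypothetical side layer mapping query regions to past destinations.

    Assumption: a region is the sign pattern of the query against a few
    random hyperplanes. The paper may define regions differently.
    """

    def __init__(self, dim, num_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_planes)]
        self.hits = defaultdict(Counter)  # region -> destination counts

    def region(self, query):
        return tuple(sum(p * q for p, q in zip(plane, query)) >= 0
                     for plane in self.planes)

    def entry_point(self, query, default):
        """Most frequent past destination for this region, else the default."""
        counts = self.hits.get(self.region(query))
        return counts.most_common(1)[0][0] if counts else default

    def record(self, query, destination):
        self.hits[self.region(query)][destination] += 1


def search(vectors, neighbors, layer, query, default_entry=0):
    """Run the unchanged greedy search from a catapult-chosen entry point."""
    start = layer.entry_point(query, default_entry)
    node, hops = greedy_search(vectors, neighbors, query, start)
    layer.record(query, node)  # remember where queries from this region end
    return node, hops
```

On a 10-node chain graph, for example, a query whose nearest node sits at the far end needs 9 hops from the default entry point the first time; repeating the same query starts at the recorded destination and needs none. A real system would also have to bound the layer's size and age out stale entries when the workload shifts, which this sketch omits.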
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Graph Theory and Algorithms
