B+ANN: A Fast Billion-Scale Disk-based Nearest-Neighbor Index

Selim Furkan Tekin; Rajesh Bordawekar

arXiv:2511.15557·cs.DB·November 20, 2025

TL;DR

The paper introduces B+ANN, a disk-based nearest-neighbor index that improves performance and memory efficiency over existing methods like HNSW and DiskANN, while supporting dissimilarity queries.

Contribution

It presents a novel B+ tree-based disk index that enhances cache locality, reduces memory use, and enables dissimilarity queries in large-scale vector search.

Findings

01. Improves recall and QPS over HNSW

02. Reduces cache misses by 19.23%

03. Decreases build time and memory usage by 24x

Abstract

The storage and processing of embedding vectors by specialized vector databases (VDBs) has become the linchpin in building modern AI pipelines. Most current VDBs employ variants of a graph-based approximate nearest-neighbor (ANN) index algorithm, HNSW, to answer semantic queries over stored vectors. In spite of its widespread use, the HNSW algorithm suffers from several issues: an in-memory design and implementation, random memory accesses that degrade cache behavior, limited scope for acceleration due to fine-grained pairwise computations, and support for semantic similarity queries only. In this paper, we present a novel disk-based ANN index, B+ANN, to address these issues: it first partitions input data into blocks containing semantically similar items, then builds a B+ tree variant to store blocks both in-memory and on disks, and finally, enables hybrid edge- and block-based…
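To make the block-partitioning idea in the abstract concrete, here is a minimal Python sketch of a block-based nearest-neighbor search. It is an illustration under stated assumptions, not the B+ANN algorithm: the function names (`build_blocks`, `search`), the `nprobe` parameter, and the use of sampled centroids with a single assignment pass are all hypothetical choices, and a plain dictionary stands in for the paper's B+ tree variant and its disk layout.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_blocks(vectors, num_blocks, seed=0):
    """Group vectors into blocks of nearby items.

    Assumption: centroids are sampled from the data and each vector is
    assigned to its nearest centroid in one pass. The paper's actual
    partitioning scheme is not described in the abstract.
    """
    rng = random.Random(seed)
    centroids = rng.sample(vectors, num_blocks)
    # A dict keyed by block id stands in for tree leaves (not a real B+ tree).
    blocks = {i: [] for i in range(num_blocks)}
    for vid, v in enumerate(vectors):
        nearest = min(range(num_blocks), key=lambda i: sq_dist(v, centroids[i]))
        blocks[nearest].append((vid, v))
    return centroids, blocks


def search(query, centroids, blocks, k=3, nprobe=2):
    """Scan the nprobe blocks whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(query, centroids[i]))
    candidates = []
    for block_id in order[:nprobe]:
        # Each block is scanned sequentially, which is the access pattern
        # that favors cache and disk locality over per-edge random reads.
        for vid, v in blocks[block_id]:
            candidates.append((sq_dist(query, v), vid))
    return [vid for _, vid in heapq.nsmallest(k, candidates)]


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.random(), rng.random()) for _ in range(1000)]
    centroids, blocks = build_blocks(data, num_blocks=16)
    print(search((0.5, 0.5), centroids, blocks, k=5, nprobe=3))
```

Raising `nprobe` scans more blocks and trades speed for recall; with `nprobe` equal to the number of blocks the search degenerates to an exact brute-force scan.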

Taxonomy

Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Graph Theory and Algorithms