Distribution-Aware Exploration for Adaptive HNSW Search
Chao Zhang, Renée J. Miller

TL;DR
This paper introduces Ada-ef, a dynamic, data-driven method for configuring the exploration factor in HNSW-based approximate nearest neighbor search, improving efficiency and recall guarantees on real-world high-dimensional data.
Contribution
We propose a theoretically grounded, query-adaptive approach for setting ef in HNSW, addressing the limitations of static configurations and enhancing search performance.
Findings
Achieves target recall with minimal computation
Reduces online query latency by up to 4x
Decreases offline computation and memory usage significantly
Abstract
Hierarchical Navigable Small World (HNSW) is widely adopted for approximate nearest neighbor search (ANNS) for its ability to deliver high recall with low latency on large-scale, high-dimensional embeddings. The exploration factor, commonly referred to as ef, is a key parameter in HNSW-based vector search that balances accuracy and efficiency. However, existing systems typically rely on manually and statically configured ef values that are uniformly applied across all queries. This results in a distribution-agnostic configuration that fails to account for the non-uniform and skewed nature of real-world embedding data and query workloads. As a consequence, HNSW-based systems suffer from two key practical issues: (i) the absence of recall guarantees, and (ii) inefficient ANNS performance due to over- or under-searching. In this paper, we propose Adaptive-ef (Ada-ef), a data-driven,…
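To make the role of ef concrete, here is a minimal sketch of the best-first search that HNSW runs on a single layer, with ef bounding the size of the result set. It is not the Ada-ef method, which the truncated abstract does not describe. The brute-force neighbor graph stands in for a real HNSW base layer, and every name here (build_knn_graph, search_layer, recall_at_k) is hypothetical and chosen for illustration only.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_knn_graph(points, m):
    """Brute-force m-nearest-neighbor graph (assumed stand-in for an HNSW layer)."""
    graph = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        graph.append(sorted(others, key=lambda j: sq_dist(p, points[j]))[:m])
    return graph


def search_layer(points, graph, query, entry, ef):
    """Best-first search keeping at most ef results.

    Returns (ids sorted by distance, number of distance evaluations).
    """
    d0 = sq_dist(query, points[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry)]     # max-heap via negation: farthest kept result on top
    evaluations = 1
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the best remaining candidate is worse than the worst kept result.
        if len(results) >= ef and d > -results[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = sq_dist(query, points[nb])
            evaluations += 1
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the farthest result
    ranked = sorted((-nd, i) for nd, i in results)
    return [i for _, i in ranked], evaluations


def recall_at_k(found, exact, k):
    """Fraction of the exact top-k neighbors that appear in the found top-k."""
    return len(set(found[:k]) & set(exact[:k])) / k


if __name__ == "__main__":
    random.seed(0)
    pts = [[random.random() for _ in range(8)] for _ in range(300)]
    graph = build_knn_graph(pts, m=8)
    q = [random.random() for _ in range(8)]
    exact = sorted(range(len(pts)), key=lambda j: sq_dist(q, pts[j]))
    for ef in (10, 20, 40, 80):
        ids, evals = search_layer(pts, graph, q, entry=0, ef=ef)
        print(ef, evals, recall_at_k(ids, exact, k=10))
```

Running the loop with different ef values on the same query shows the trade-off the abstract refers to: a larger ef generally costs more distance evaluations and tends to raise recall, and the smallest ef that reaches a given recall differs from query to query. A single static ef therefore over-searches some queries and under-searches others.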
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Information Retrieval and Search Behavior · Advanced Neural Network Applications
