CANDY: A Benchmark for Continuous Approximate Nearest Neighbor Search with Dynamic Data Ingestion
Xianzhi Zeng, Zhuoyan Wu, Xinjing Hu, Xuanhua Shi, Shixuan Sun, Shuhao Zhang

TL;DR
CANDY introduces a comprehensive benchmark for evaluating AKNN algorithms in dynamic data environments, emphasizing update efficiency and real-world applicability, with findings that simpler methods often outperform complex ones.
Contribution
The paper presents CANDY, a novel benchmark for continuous AKNN search that incorporates dynamic data ingestion and advanced optimizations, addressing limitations of static benchmarks.
Findings
Simpler AKNN baselines often outperform complex algorithms in recall and latency.
Existing benchmarks overlook update efficiency, limiting real-world applicability.
The benchmark reveals new insights into AKNN performance in dynamic environments.
Abstract
Approximate K Nearest Neighbor (AKNN) algorithms play a pivotal role in various AI applications, including information retrieval, computer vision, and natural language processing. Although numerous AKNN algorithms and benchmarks have been developed recently to evaluate their effectiveness, the dynamic nature of real-world data presents significant challenges that existing benchmarks fail to address. Traditional benchmarks primarily assess retrieval effectiveness in static contexts and often overlook update efficiency, which is crucial for handling continuous data ingestion. This limitation results in an incomplete assessment of an AKNN algorithm's ability to adapt to changing data patterns, thereby restricting insights into its performance in dynamic environments. To address these gaps, we introduce CANDY, a benchmark tailored for Continuous Approximate Nearest Neighbor Search with…
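To make the static-versus-dynamic distinction concrete, the following is a minimal sketch of an evaluation loop that interleaves ingestion with search and records both insert time and recall after every batch. It is not CANDY's code or API. Every name here (BruteForceIndex, SampledIndex, run_stream, recall_at_k) is hypothetical, the data is random, and SampledIndex is a deliberately crude stand-in for an approximate index.

```python
import random
import time


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class BruteForceIndex:
    """Exact baseline (hypothetical): stores every vector, scans all at query time."""

    def __init__(self):
        self.vectors = []

    def insert(self, batch):
        self.vectors.extend(batch)

    def query(self, q, k):
        ids = range(len(self.vectors))
        return sorted(ids, key=lambda i: sq_dist(self.vectors[i], q))[:k]


class SampledIndex(BruteForceIndex):
    """Toy approximate index (hypothetical): scans only every `stride`-th vector."""

    def __init__(self, stride=2):
        super().__init__()
        self.stride = stride

    def query(self, q, k):
        ids = range(0, len(self.vectors), self.stride)
        return sorted(ids, key=lambda i: sq_dist(self.vectors[i], q))[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def run_stream(index, batches, queries, k):
    """Ingest batch by batch; after each batch return (insert_seconds, mean_recall)."""
    truth = BruteForceIndex()  # exact ground truth over the data ingested so far
    results = []
    for batch in batches:
        start = time.perf_counter()
        index.insert(batch)
        insert_s = time.perf_counter() - start
        truth.insert(batch)
        recalls = [recall_at_k(index.query(q, k), truth.query(q, k)) for q in queries]
        results.append((insert_s, sum(recalls) / len(recalls)))
    return results


# Synthetic stream: 4 batches of 50 random 8-dimensional vectors, 5 queries.
rng = random.Random(0)
batches = [[[rng.random() for _ in range(8)] for _ in range(50)] for _ in range(4)]
queries = [[rng.random() for _ in range(8)] for _ in range(5)]

exact_results = run_stream(BruteForceIndex(), batches, queries, k=5)
approx_results = run_stream(SampledIndex(stride=2), batches, queries, k=5)
```

A static benchmark would report only the recall measured after the final batch. Recording insert time and recall per batch exposes the update-efficiency dimension the abstract says existing benchmarks overlook.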
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Machine Learning and Data Classification
