Towards Robustness: A Critique of Current Vector Database Assessments

Zikai Wang; Qianxi Zhang; Baotong Lu; Qi Chen; Cheng Tan

arXiv:2507.00379·cs.DB·April 3, 2026


TL;DR

This paper critiques the reliance on average recall for evaluating vector databases, introducing a new robustness metric that better captures performance variability across queries.

Contribution

It proposes Robustness-$\delta$@K, a novel metric for assessing vector database robustness, and demonstrates its effectiveness in benchmarking and guiding improvements.

Findings

01

Robustness-$\delta$@K reveals significant differences in index robustness.

02

More robust indexes improve downstream application performance.

03

Design factors influencing robustness are identified and analyzed.

Abstract

Vector databases are critical infrastructure in AI systems, and average recall is the dominant metric for their evaluation. Both users and researchers rely on it to choose and optimize their systems. We show that relying on average recall is problematic. It hides variability across queries, allowing systems with strong mean performance to underperform significantly on hard queries. These tail cases confuse users and can lead to failure in downstream applications such as RAG. We argue that robustness, that is, consistently achieving acceptable recall across queries, is crucial to vector database evaluation. We propose Robustness-$\delta$@K, a new metric that captures the fraction of queries with recall above a threshold $\delta$. This metric offers a deeper view of the recall distribution, helps select vector indexes according to application needs, and guides the optimization of tail performance. We…
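The metric described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `recall_at_k` and `robustness_delta_at_k`, the use of `>=` for "above a threshold", and the per-query recall values for the two hypothetical indexes are all assumptions made for illustration.

```python
from statistics import mean
from typing import Sequence


def recall_at_k(retrieved: Sequence[int], ground_truth: Sequence[int], k: int) -> float:
    """Share of the true top-k neighbor ids found among the first k retrieved ids.

    Assumes ground_truth holds at least k ids, ordered by true distance.
    """
    truth = set(ground_truth[:k])
    hits = len(truth.intersection(retrieved[:k]))
    return hits / k


def robustness_delta_at_k(recalls: Sequence[float], delta: float) -> float:
    """Fraction of queries whose recall@K reaches the threshold delta.

    Using >= rather than > is an assumption; the abstract only says "above".
    """
    if not recalls:
        raise ValueError("recalls must contain at least one query")
    return sum(r >= delta for r in recalls) / len(recalls)


# Hypothetical per-query recall@10 values for two indexes (illustrative only).
index_a = [0.9, 0.9, 0.9, 0.9]  # uniform recall across queries
index_b = [1.0, 1.0, 1.0, 0.6]  # similar mean, but one hard query fails badly

print(mean(index_a), mean(index_b))          # both averages are about 0.9
print(robustness_delta_at_k(index_a, 0.8))   # 1.0
print(robustness_delta_at_k(index_b, 0.8))   # 0.75
```

In this toy data both indexes report roughly the same average recall, yet only the first keeps every query at or above the 0.8 threshold, which is the kind of difference average recall hides.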
