Counterexamples expose gaps in the proof of time complexity for cover trees introduced in 2006
Yury Elkin, Vitaliy Kurlin

TL;DR
This paper presents counterexamples that expose gaps in previous proofs of near-linear time complexity for cover trees in k-nearest neighbor search, and proposes a compressed cover tree as the basis for a corrected analysis.
Contribution
It constructs explicit datasets that challenge prior proofs of cover tree time complexity and introduces a new compressed cover tree for corrected analysis.
Findings
Counterexamples invalidate previous proofs of near-linear time complexity.
Original proof methods missed potential nodes in cover trees during neighbor search.
A new compressed cover tree approach is proposed for accurate complexity analysis.
Abstract
This paper is motivated by the k-nearest neighbors search: given an arbitrary metric space and its finite subsets (a reference set R and a query set Q), design a fast algorithm to find all k-nearest neighbors in R for every point q in Q. In 2006, Beygelzimer, Kakade, and Langford introduced cover trees to justify a near-linear time complexity for the neighbor search in the sizes of Q and R. Section 5.3 of Curtin's PhD (2015) pointed out that the proof of this result was wrong. The key step in the original proof attempted to show that the number of iterations can be estimated by multiplying the length of the longest root-to-leaf path in a cover tree by a constant factor. However, this estimate can miss many potential nodes in several branches of a cover tree that should be considered during the neighbor search. The same argument was unfortunately repeated in several subsequent papers…
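To make the problem statement concrete, the following is a minimal brute-force sketch of k-nearest neighbors search, assuming points are arbitrary Python objects and the metric is passed in as a function. It is not the cover tree algorithm and not the authors' method: it is the naive baseline that computes every query-to-reference distance, which is the cost that cover trees were introduced to avoid. The names `brute_force_knn` and `dist` are hypothetical and chosen only for illustration.

```python
import heapq
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")  # type of a point in the metric space


def brute_force_knn(
    R: Sequence[P],
    Q: Sequence[P],
    k: int,
    dist: Callable[[P, P], float],
) -> List[List[P]]:
    """For every query q in Q, return its k nearest points of R, nearest first.

    Naive baseline: it evaluates dist(q, r) for every pair (q, r),
    so it uses |Q| * |R| distance computations in total.
    """
    neighbors = []
    for q in Q:
        # nsmallest returns the k items of R with the smallest distance to q,
        # sorted by increasing distance.
        neighbors.append(heapq.nsmallest(k, R, key=lambda r: dist(q, r)))
    return neighbors


# Usage on the real line with the absolute-difference metric (assumed example data).
reference = [0.0, 1.0, 3.0, 7.0]
queries = [0.9, 6.0]
result = brute_force_knn(reference, queries, k=2, dist=lambda a, b: abs(a - b))
```

Here `result[0]` holds the two reference points nearest to 0.9 and `result[1]` those nearest to 6.0. Any function satisfying the metric axioms can be supplied as `dist`, which matches the arbitrary metric space setting of the abstract.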
Taxonomy
Topics: Data Management and Algorithms · Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques
