The Cascading Metric Tree
Jeffrey Uhlmann, Miguel R. Zuniga

TL;DR
The paper introduces the Cascaded Metric Tree (CMT), a data structure that speeds up metric search queries by reusing the distance calculations made along each tree path for pruning. It outperforms classical metric search structures on synthetic datasets of up to 10 million objects and on the Swiss-Prot protein sequence dataset.
Contribution
It presents the Cascaded Metric Tree (CMT), a new metric search structure that exploits cascaded distance information for improved query performance, including a near-optimal kNN algorithm.
Findings
CMT outperforms classical metric search structures on synthetic datasets of up to 10 million objects.
CMT also outperforms them on the 564K-sequence Swiss-Prot protein dataset.
Reference implementations are provided for practical use and further research.
Abstract
This paper presents the Cascaded Metric Tree (CMT) for efficient satisfaction of metric search queries over a dataset of N objects. It provides extra information that permits query algorithms to exploit all distance calculations performed along each path in the tree for pruning purposes. In addition to improving standard metric range (ball) query algorithms, we present a new algorithm for exploiting the CMT cascaded information to achieve near-optimal performance for k-nearest neighbor (kNN) queries. We demonstrate the performance advantage of CMT over classical metric search structures on synthetic datasets of up to 10 million objects and on the 564K Swiss-Prot protein sequence dataset containing over 200 million amino acids. As a supplement to the paper, we provide reference implementations of the empirically examined algorithms to encourage improvements and further applications of…
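The pruning idea mentioned in the abstract rests on the triangle inequality: if d(q, p) and d(x, p) are both known for some reference object p, then |d(q, p) - d(x, p)| is a lower bound on d(q, x), so x can be discarded without computing d(q, x) whenever that bound exceeds the query radius. The Python sketch below illustrates only this general principle with a flat table of precomputed distances; it is not the CMT structure or the authors' algorithm, and the names `build_pivot_table`, `lower_bound`, and `range_query` are hypothetical, chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance; any metric obeying the triangle inequality works."""
    return math.dist(a, b)


def build_pivot_table(objects, pivots, dist=euclidean):
    """Precompute d(x, p) for every object x and every pivot p (hypothetical helper)."""
    return [[dist(x, p) for p in pivots] for x in objects]


def lower_bound(query_to_pivots, object_to_pivots):
    """Largest triangle-inequality lower bound on d(query, object)."""
    return max(abs(dq - dx) for dq, dx in zip(query_to_pivots, object_to_pivots))


def range_query(query, radius, objects, pivots, table, dist=euclidean):
    """Return (hits, number of query-to-object distances actually computed)."""
    # Distances from the query to the pivots are computed once and reused
    # for every candidate, in the spirit of reusing earlier calculations.
    query_to_pivots = [dist(query, p) for p in pivots]
    hits, computed = [], 0
    for x, row in zip(objects, table):
        if lower_bound(query_to_pivots, row) > radius:
            continue  # pruned: x cannot lie within the query ball
        computed += 1
        if dist(query, x) <= radius:
            hits.append(x)
    return hits, computed


if __name__ == "__main__":
    pivots = [(0.0, 0.0), (10.0, 0.0)]
    objects = [(1.0, 1.0), (9.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
    table = build_pivot_table(objects, pivots)
    hits, computed = range_query((0.0, 0.0), 2.0, objects, pivots, table)
    print(hits, computed)
```

In this toy run two of the four objects are pruned by the bound alone, so only two query-to-object distances are evaluated. A tree-based structure would apply the same test to the distances accumulated along a root-to-leaf path rather than to a flat table.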
Taxonomy
Topics: Data Management and Algorithms · Algorithms and Data Compression · Advanced Database Systems and Queries
