Improved Search of Relevant Points for Nearest-Neighbor Classification

Alejandro Flores-Velazco

arXiv:2203.03567 · cs.CG · March 8, 2022

TL;DR

This paper presents an improved algorithm for identifying the relevant (border) points in nearest-neighbor classification. It removes the quadratic $O(n^2)$ term from the running time of an existing output-sensitive algorithm.

Contribution

The author improves an existing output-sensitive algorithm for border point detection by showing that its initial steps can be skipped, which lowers the time complexity.

Findings

01

Reduced the algorithm's running time from $O(n^2 + nk^2)$ to $O(nk^2)$, where $n$ is the number of training points and $k$ is the number of border points.

02

Proved that the algorithm's initial steps, which take $O(n^2)$ time, are unnecessary.

03

Sped up training-set reduction for nearest-neighbor classifiers: the border points can be found faster without affecting classification accuracy.
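To get a feel for when dropping the quadratic term matters, the two bounds can be compared directly. This is only a rough sketch: it ignores constant factors, and the function names and the sample values of n and k are illustrative assumptions, not figures from the paper.

```python
def old_bound(n: int, k: int) -> int:
    """Operation count suggested by O(n^2 + n k^2), constants ignored."""
    return n**2 + n * k**2


def new_bound(n: int, k: int) -> int:
    """Operation count suggested by O(n k^2), constants ignored."""
    return n * k**2


# Hypothetical sizes: one million training points, varying border-set size.
n = 10**6
for k in (10, 100, 1000, 10000):
    ratio = old_bound(n, k) / new_bound(n, k)  # algebraically 1 + n / k^2
    print(f"k={k:>6}: old/new = {ratio:.2f}")
```

Since the ratio is 1 + n / k^2, the saving is largest when the number of border points is much smaller than the square root of n, and it fades once k^2 exceeds n.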

Abstract

Given a training set $P \subset \mathbb{R}^{d}$, the nearest-neighbor classifier assigns any query point $q \in \mathbb{R}^{d}$ to the class of its closest point in $P$. To answer these classification queries, some training points are more relevant than others. We say a training point is relevant if its omission from the training set could induce the misclassification of some query point in $\mathbb{R}^{d}$. These relevant points are commonly known as border points, as they define the boundaries of the Voronoi diagram of $P$ that separate points of different classes. Being able to compute this set of points efficiently is crucial to reduce the size of the training set without affecting the accuracy of the nearest-neighbor classifier. Improving over a decades-long result by Clarkson, in a recent paper by Eppstein an output-sensitive algorithm was proposed to find the set of border points of…
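The definitions in the abstract can be illustrated with a small sketch. It is not the algorithm of Eppstein or of this paper: it is a brute-force illustration restricted to d = 1, where Voronoi neighbors are simply consecutive points in sorted order, so a point is a border point exactly when an adjacent point has a different class. The function names, the toy data, and the assumption that training points are distinct are all hypothetical choices made for the example.

```python
from math import dist


def nn_classify(training, q):
    """Return the class of the training point closest to query q."""
    _, label = min(training, key=lambda pl: dist(pl[0], q))
    return label


def border_points_1d(training):
    """Border points for d = 1 only: sorted neighbors with different classes.

    Assumes distinct points, each given as ((x,), label).
    """
    pts = sorted(training)
    border = set()
    for (a, la), (b, lb) in zip(pts, pts[1:]):
        if la != lb:  # the Voronoi boundary between a and b separates classes
            border.add((a, la))
            border.add((b, lb))
    return sorted(border)


# Toy training set on the real line (illustrative data).
training = [((0.0,), "red"), ((1.0,), "red"), ((2.0,), "red"),
            ((3.0,), "blue"), ((4.0,), "blue")]
border = border_points_1d(training)

# Classifying with only the border points agrees with the full training set.
queries = [(i * 0.3,) for i in range(-5, 20)]
agree = all(nn_classify(training, q) == nn_classify(border, q) for q in queries)
print(border, agree)
```

In this toy set only the two points on either side of the class change are kept; the other three can be dropped without changing any of the sampled classifications, which is the sense in which border points are "relevant".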

Taxonomy

Topics: Machine Learning and Algorithms · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification