Reducing Nearest Neighbor Training Sets Optimally and Exactly
Josiah Rohrer, Simon Weber

TL;DR
This paper studies how to optimally and exactly reduce training sets in nearest-neighbor classification by identifying relevant points, providing algorithms and complexity results for different dimensions.
Contribution
It shows that the relevant points form a minimum-cardinality reduced training set whenever the training set is in general position, and it establishes the complexity of finding such a set for possibly degenerate training sets in each dimension.
Findings
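If the training set is in general position, the relevant points form a minimum-cardinality reduced training set.
For possibly degenerate training sets, finding a minimum-cardinality reduced set is in P in dimension 1.
For possibly degenerate training sets, finding a minimum-cardinality reduced set is NP-complete in dimension 2 and higher.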
Abstract
In nearest-neighbor classification, a training set $P$ of points in $\mathbb{R}^d$ with given classification is used to classify every point in $\mathbb{R}^d$: Every point gets the same classification as its nearest neighbor in $P$. Recently, Eppstein [SOSA'22] developed an algorithm to detect the relevant training points, those points $p \in P$ such that $P$ and $P \setminus \{p\}$ induce different classifications. We investigate the problem of finding the minimum cardinality reduced training set $P' \subseteq P$ such that $P$ and $P'$ induce the same classification. We show that the set of relevant points is such a minimum cardinality reduced training set if $P$ is in general position. Furthermore, we show that finding a minimum cardinality reduced training set for possibly degenerate $P$ is in P for $d = 1$, and NP-complete for $d \geq 2$.
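To make the definitions in the abstract concrete, here is a minimal one-dimensional sketch. It is not the algorithm from the paper or from Eppstein [SOSA'22]. It is a brute-force illustration under stated assumptions: training points are `(x, label)` pairs with distinct integer coordinates, and ties exactly at a decision boundary are ignored. The helper names `signature`, `relevant_points`, and `minimum_reduced_set` are hypothetical. In one dimension the induced classification is fully described by the label of the leftmost region plus the midpoints where the label changes, so two training sets induce the same classification exactly when these descriptions match.

```python
from itertools import combinations


def signature(points):
    """Describe the 1D nearest-neighbor classification induced by `points`.

    Returns (label of the leftmost region, tuple of label changes). Each
    change is stored as (x1 + x2, left_label, right_label), i.e. twice the
    midpoint, so that integer inputs stay exact.
    """
    pts = sorted(points)
    if not pts:
        return (None, ())
    boundaries = tuple(
        (x1 + x2, l1, l2)
        for (x1, l1), (x2, l2) in zip(pts, pts[1:])
        if l1 != l2  # the label only changes between differently labeled neighbors
    )
    return (pts[0][1], boundaries)


def relevant_points(points):
    """Points whose individual removal changes the induced classification."""
    full = signature(points)
    return [p for p in points
            if signature([q for q in points if q != p]) != full]


def minimum_reduced_set(points):
    """Smallest subset inducing the same classification (exponential search)."""
    full = signature(points)
    pts = sorted(points)
    for size in range(len(pts) + 1):
        for subset in combinations(pts, size):
            if signature(subset) == full:
                return list(subset)


# No two pairs of points share a midpoint: relevant points are the minimum set.
generic = [(0, "a"), (1, "a"), (3, "b"), (7, "b"), (12, "a")]
print(relevant_points(generic))      # [(1, 'a'), (3, 'b'), (7, 'b'), (12, 'a')]
print(minimum_reduced_set(generic))  # [(1, 'a'), (3, 'b'), (7, 'b'), (12, 'a')]

# Pairs (1, 3) and (0, 4) share a midpoint, as do (4, 8) and (0, 12).
shared_midpoints = [(0, "a"), (1, "a"), (3, "b"), (4, "b"), (8, "a")]
print(relevant_points(shared_midpoints))      # four points
print(minimum_reduced_set(shared_midpoints))  # [(0, 'a'), (4, 'b'), (8, 'a')]
```

On the first input the relevant points and the brute-force minimum coincide. On the second input, where several pairs of points share a midpoint, the search finds a three-point set that contains a non-relevant point and is smaller than the set of four relevant points. This suggests why degenerate inputs need separate treatment, although whether this input counts as degenerate under the paper's own definition of general position is an assumption here.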
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Image and Object Detection Techniques
