Reducing Nearest Neighbor Training Sets Optimally and Exactly

Josiah Rohrer; Simon Weber

arXiv:2302.02132 · cs.CG · February 7, 2023

Open Access

TL;DR

This paper studies how to optimally and exactly reduce training sets in nearest-neighbor classification by identifying relevant points, providing algorithms and complexity results for different dimensions.

Contribution

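It shows that, when the training set is in general position, the relevant points form a minimum-cardinality reduced training set, and it establishes the complexity of finding such a set for possibly degenerate inputs: polynomial time for $d = 1$ and NP-complete for $d \geq 2$.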

Findings

01

If $P$ is in general position, the set of relevant points is a minimum-cardinality reduced training set.

02

For possibly degenerate $P$, finding a minimum-cardinality reduced training set is solvable in polynomial time for $d = 1$.

03

For possibly degenerate $P$, finding a minimum-cardinality reduced training set is NP-complete for $d \geq 2$.

Abstract

In nearest-neighbor classification, a training set $P$ of points in $\mathbb{R}^{d}$ with given classification is used to classify every point in $\mathbb{R}^{d}$: Every point gets the same classification as its nearest neighbor in $P$. Recently, Eppstein [SOSA'22] developed an algorithm to detect the relevant training points, those points $p \in P$, such that $P$ and $P \setminus \{p\}$ induce different classifications. We investigate the problem of finding the minimum cardinality reduced training set $P' \subseteq P$ such that $P$ and $P'$ induce the same classification. We show that the set of relevant points is such a minimum cardinality reduced training set if $P$ is in general position. Furthermore, we show that finding a minimum cardinality reduced training set for possibly degenerate $P$ is in P for $d = 1$, and NP-complete for $d \geq 2$.
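
To make the definitions in the abstract concrete, the following is a minimal brute-force sketch for the case $d = 1$. It is not the algorithm from the paper or from Eppstein's work: it simply enumerates subsets, so it takes exponential time and is only meant for tiny inputs. The function names (`signature`, `relevant_points`, `minimum_reduced_set`) and the example data are hypothetical. The sketch assumes distinct coordinates and compares classifications only away from tie points, where a query is equidistant to two training points; how such ties are treated is an assumption here, not something taken from the paper.

```python
from itertools import combinations


def signature(points):
    """Canonical description of the 1D nearest-neighbor classification.

    `points` is an iterable of (coordinate, label) pairs with distinct
    coordinates. The classification is piecewise constant: it starts with
    the label of the leftmost point and changes only at midpoints between
    consecutive points that carry different labels.
    """
    pts = sorted(points)
    if not pts:
        return None
    boundaries = tuple(
        ((x1 + x2) / 2, l1, l2)
        for (x1, l1), (x2, l2) in zip(pts, pts[1:])
        if l1 != l2
    )
    return (pts[0][1], boundaries)


def relevant_points(points):
    """Points p such that P and P without p induce different classifications."""
    full = signature(points)
    return [
        p for p in points
        if signature([q for q in points if q != p]) != full
    ]


def minimum_reduced_set(points):
    """Smallest subset inducing the same classification (exhaustive search)."""
    full = signature(points)
    for k in range(1, len(points) + 1):
        for subset in combinations(points, k):
            if signature(subset) == full:
                return list(subset)
    return []


# Hypothetical example: two classes on a line.
P = [(0, "a"), (1, "a"), (3, "b"), (7, "b")]
print(relevant_points(P))
print(minimum_reduced_set(P))
```

On this small example both functions return the same two points, the ones adjacent to the single decision boundary, which matches the statement that the relevant points form a minimum-cardinality reduced training set for inputs in general position. For an input such as `[(0, "a"), (1, "a"), (3, "b"), (4, "b")]`, the pair at 0 and 4 induces the same boundary as the pair at 1 and 3, so a minimum reduced set need not be unique once coincidences like this occur.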

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Image and Object Detection Techniques