Combining Feature and Prototype Pruning by Uncertainty Minimization
Marc Sebban, Richard Nock

TL;DR
This paper introduces a joint feature and prototype pruning method for k-nearest neighbor classification that uses an uncertainty-based criterion, leading to more efficient dataset reduction and improved robustness against noise.
Contribution
It proposes a combined feature and prototype pruning approach driven by uncertainty minimization, in contrast to standard storage reduction algorithms, which treat feature selection and prototype selection independently.
Findings
Improves dataset reduction efficiency in k-NN classification.
Avoids many distance calculations by progressively removing useless instances during feature pruning.
Demonstrates robustness to noisy data.
Abstract
We focus in this paper on dataset reduction techniques for use in k-nearest neighbor classification. In such a context, feature and prototype selections have always been independently treated by the standard storage reduction algorithms. While this separation is theoretically justified by the fact that each subproblem is NP-hard, we assume in this paper that a joint storage reduction is in fact more intuitive and can in practice provide better results than two independent processes. Moreover, it avoids a lot of distance calculations by progressively removing useless instances during the feature pruning. While standard selection algorithms often optimize the accuracy to discriminate the set of solutions, we use in this paper a criterion based on an uncertainty measure within a nearest-neighbor graph. This choice comes from recent results that have proven that accuracy is not always the…
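The joint scheme described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the excerpt does not specify the uncertainty measure or the instance-removal rule, so the code below assumes Gini impurity over each point's k-NN neighbourhood as the uncertainty, and swaps in a Wilson-style editing rule (drop a point whose label disagrees with the majority of its neighbours) for the prototype step. All function names (`knn`, `impurity`, `graph_uncertainty`, `edit_prototypes`, `joint_prune`) are hypothetical.

```python
from collections import Counter
from math import dist


def knn(X, i, features, k):
    """Indices of the k nearest neighbours of X[i], measured on `features` only."""
    p = [X[i][f] for f in features]
    scored = sorted(
        (dist(p, [X[j][f] for f in features]), j)
        for j in range(len(X)) if j != i
    )
    return [j for _, j in scored[:k]]


def impurity(labels):
    """Gini impurity of a list of labels: 0.0 when all labels agree (assumed measure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def graph_uncertainty(X, y, features, k):
    """Mean impurity over every point's neighbourhood (the point plus its k-NN)."""
    total = 0.0
    for i in range(len(X)):
        total += impurity([y[i]] + [y[j] for j in knn(X, i, features, k)])
    return total / len(X)


def edit_prototypes(X, y, features, k):
    """Wilson-style editing (assumed rule): keep points that agree with their k-NN majority."""
    kept = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in knn(X, i, features, k))
        if votes.most_common(1)[0][0] == y[i]:
            kept.append(i)
    return kept


def joint_prune(X, y, k=3):
    """Alternate backward feature elimination with prototype editing."""
    features = list(range(len(X[0])))
    keep = list(range(len(X)))
    while len(features) > 1:
        Xk, yk = [X[i] for i in keep], [y[i] for i in keep]
        base = graph_uncertainty(Xk, yk, features, k)
        # Uncertainty of the graph after removing each candidate feature.
        scores = {f: graph_uncertainty(Xk, yk, [g for g in features if g != f], k)
                  for f in features}
        drop = min(scores, key=scores.get)
        if scores[drop] > base:
            break  # every removal increases uncertainty
        features.remove(drop)
        # Shrink the instance set now so later passes compute fewer distances.
        survivors = edit_prototypes(Xk, yk, features, k)
        if len(survivors) > k:
            keep = [keep[i] for i in survivors]
    return features, keep
```

Calling `joint_prune(X, y, k=3)` on a list of numeric feature vectors `X` and labels `y` returns the indices of the retained features and the retained instances. Because instances are edited inside the feature loop, each later pass builds its neighbour graph over a smaller set, which is the source of the saved distance computations the abstract mentions.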
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Data Mining Algorithms and Applications
