Distance and Similarity Measures Effect on the Performance of K-Nearest Neighbor Classifier -- A Review
V. B. Surya Prasath, Haneen Arafat Abu Alfeilat, Ahmad B. A. Hassanat, Omar Lasassmeh, Ahmad S. Tarawneh, Mahmoud Bashir Alhasanat, Hamzeh S. Eyal Salman

TL;DR
This review evaluates how different distance and similarity measures affect KNN classifier performance across various datasets and noise levels, highlighting the superior robustness of a recent non-convex distance.
Contribution
It systematically compares numerous distance measures for KNN, identifying the most effective and noise-tolerant distances through extensive experiments.
Findings
A recent non-convex distance performed best across the tested datasets.
KNN performance varies significantly with the distance measure used.
The top-performing distances lose only about 20% of their performance even at 90% noise.
Abstract
The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature. The core of this classifier depends mainly on measuring the distance or similarity between the tested examples and the training examples. This raises a major question: which of the many available distance and similarity measures should be used for the KNN classifier? This review attempts to answer this question by evaluating the performance (measured by accuracy, precision, and recall) of KNN using a large number of distance measures, tested on a number of real-world datasets, with and without added noise at different levels. The experimental results show that the performance of the KNN classifier depends significantly on the distance used, and the results showed large gaps…
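The dependence of KNN on the distance measure can be illustrated with a minimal sketch. It is not the authors' experimental code. The function names (knn_predict, euclidean, manhattan, chebyshev), the three distances chosen, and the toy data are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math
from collections import Counter


def euclidean(a, b):
    # L2 distance: square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # L1 distance: sum of absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # L-infinity distance: largest absolute difference over all attributes.
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Return the majority label among the k training examples closest to query.

    Hypothetical helper: `distance` is any function mapping two equal-length
    sequences of numbers to a non-negative number.
    """
    # Rank training examples by their distance to the query point.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    # Majority vote over the k nearest labels.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (illustrative only): two training points with different labels.
train_X = [(3.0, 3.0), (4.0, 0.0)]
train_y = ["a", "b"]
query = (0.0, 0.0)

for metric in (euclidean, manhattan, chebyshev):
    print(metric.__name__, knn_predict(train_X, train_y, query, k=1, distance=metric))
```

On this toy data, the Euclidean and Manhattan distances rank the point labelled "b" as nearest to the query, while the Chebyshev distance ranks the point labelled "a" as nearest, so the predicted class changes with the distance alone.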
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Face and Expression Recognition
