Effect of Different Distance Measures on the Performance of K-Means Algorithm: An Experimental Study in Matlab
Mr. Dibya Jyoti Bora, Dr. Anil Kumar Gupta

TL;DR
This paper experimentally investigates how different distance measures affect the performance of the K-Means clustering algorithm on iris and wine datasets using Matlab, highlighting the importance of selecting appropriate distance metrics.
Contribution
It provides an empirical comparison of various distance measures in K-Means clustering, emphasizing their impact on clustering performance based on dataset characteristics.
Findings
Performance varies significantly with different distance measures.
Certain distance measures yield better clustering results for specific datasets.
The choice of distance measure is crucial for optimal K-Means performance.
Abstract
The K-means algorithm is a very popular clustering algorithm, known for its simplicity. The distance measure plays a very important role in the performance of this algorithm. Several distance measures are available, but the choice of a proper one depends entirely on the type of data to be clustered. In this paper, an experimental study is carried out in Matlab to cluster the iris and wine data sets with different distance measures and to observe the resulting variation in performance.
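To make the idea in the abstract concrete, here is a minimal sketch of K-means with a swappable distance measure. It is written in Python rather than Matlab and is not the authors' code. The function names, the four toy points, and the choice of three measures (squared Euclidean, city block, cosine) are all assumptions made for illustration; the points are not taken from the iris or wine data. Each measure is paired with the centroid update conventionally used with it: the mean, the component-wise median, and the mean of unit-normalized points.

```python
import math
from statistics import median


def sq_euclidean(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cityblock(a, b):
    """City block (Manhattan, L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    """One minus the cosine of the angle between a and b (non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean_center(points):
    """Component-wise mean of the points."""
    return [sum(col) / len(points) for col in zip(*points)]


def median_center(points):
    """Component-wise median of the points."""
    return [median(col) for col in zip(*points)]


def cosine_center(points):
    """Mean of the points after scaling each one to unit length."""
    unit = []
    for p in points:
        norm = math.sqrt(sum(x * x for x in p))
        unit.append([x / norm for x in p])
    return mean_center(unit)


# Hypothetical registry: measure name -> (distance function, centroid update).
MEASURES = {
    "sqeuclidean": (sq_euclidean, mean_center),
    "cityblock": (cityblock, median_center),
    "cosine": (cosine, cosine_center),
}


def kmeans(points, init_centers, measure, max_iter=100):
    """Lloyd-style K-means; returns (labels, centers)."""
    dist, update = MEASURES[measure]
    centers = [list(c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each non-empty cluster's center.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = update(members)
    return labels, centers


# Toy data (assumed): two short and two long vectors, pointing in two
# rough directions (near the x-axis and near the y-axis).
POINTS = [(1.0, 0.2), (0.2, 1.0), (10.0, 6.0), (6.0, 10.0)]
INIT = [POINTS[0], POINTS[3]]  # same starting centers for every measure

for name in MEASURES:
    labels, _ = kmeans(POINTS, INIT, name)
    print(name, labels)
```

On this toy input, squared Euclidean and city block group the points by magnitude (labels `[0, 0, 1, 1]`), while cosine groups them by direction (labels `[0, 1, 0, 1]`). The same data and the same starting centers therefore yield different partitions depending only on the distance measure.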
Taxonomy
Topics: Advanced Clustering Algorithms Research · Advanced Data Compression Techniques · Face and Expression Recognition
