A new distance measurement and its application in K-Means Algorithm
Yiqun Zhang, Houbiao Li

TL;DR
This paper introduces a novel view-distance metric for K-Means clustering, improving its ability to capture data structure and enhance clustering and classification accuracy on both synthetic and real-world datasets.
Contribution
The paper proposes a new view-distance measure for K-Means that better captures dataset structure, especially in high-dimensional spaces, and demonstrates its effectiveness through experiments.
Findings
Improved clustering boundaries on manifold datasets
Enhanced classification accuracy on real-world data
Better data structure representation with view-distance
Abstract
The K-Means clustering algorithm is one of the most commonly used clustering algorithms because of its simplicity and efficiency. K-Means based on Euclidean distance considers only the straight-line distance between samples and ignores the overall distribution structure of the dataset (i.e., the fluid structure of the dataset). Since it is difficult to describe the internal structure between two data points with Euclidean distance in a high-dimensional data space, we propose a new distance measurement, namely view-distance, and apply it to the K-Means algorithm. On the classical manifold learning datasets, S-curve and Swiss roll, this new distance not only clusters the data according to the structure of the data itself, but also yields neat dividing lines as the boundaries between categories. Moreover, we also tested the classification accuracy and clustering effect of the…
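To make the idea of applying a different distance in K-Means concrete, here is a minimal sketch of K-Means with a pluggable distance function. It is an illustration only, not the authors' implementation: the abstract does not define view-distance, so the sketch uses Euclidean distance as the default and a Manhattan distance purely as a stand-in for "some other metric". The names `kmeans`, `euclidean`, and `manhattan` are hypothetical, and the centroid update is the ordinary coordinate-wise mean, which may differ from what the paper does.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Straight-line distance: the baseline the abstract contrasts against."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Point, b: Point) -> float:
    """Stand-in alternative metric (NOT the paper's view-distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(
    points: List[Point],
    k: int,
    distance: Distance = euclidean,
    max_iter: int = 100,
    seed: int = 0,
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd-style K-Means where the assignment step uses `distance`."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)

    for _ in range(max_iter):
        # Assignment step: nearest centroid under the chosen distance.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels

        # Update step: coordinate-wise mean of each cluster's members
        # (an assumption; empty clusters keep their previous centroid).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]

    return labels, centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
            (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
    print(kmeans(data, k=2))                      # Euclidean baseline
    print(kmeans(data, k=2, distance=manhattan))  # swapped-in metric
```

Swapping the metric changes only the assignment step here. The mean is the optimal centroid for squared Euclidean distance, so a different metric may call for a different update rule; the abstract does not say how the paper handles this.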
Taxonomy
Topics: Advanced Clustering Algorithms Research · Advanced Computing and Algorithms
Methods: k-Means Clustering
