A new distance measurement and its application in K-Means Algorithm

Yiqun Zhang; Houbiao Li

arXiv:2206.05215 · cs.LG · June 13, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a novel view-distance metric for K-Means clustering, improving its ability to capture data structure and enhance clustering and classification accuracy on both synthetic and real-world datasets.

Contribution

The paper proposes a new view-distance measure for K-Means that better captures dataset structure, especially in high-dimensional spaces, and demonstrates its effectiveness through experiments.

Findings

01 Improved clustering boundaries on manifold datasets
02 Enhanced classification accuracy on real-world data
03 Better data structure representation with view-distance

Abstract

The K-Means clustering algorithm is one of the most commonly used clustering algorithms because of its simplicity and efficiency. K-Means based on Euclidean distance considers only the straight-line distance between samples and ignores the overall distribution structure of the dataset (i.e., the fluid structure of the dataset). Since it is difficult to describe the internal structure between two data points by Euclidean distance in a high-dimensional data space, we propose a new distance measurement, namely view-distance, and apply it to the K-Means algorithm. On the classical manifold learning datasets, the S-curve and Swiss roll datasets, this new distance not only clusters the data according to the structure of the data itself, but also yields neat dividing lines as the boundaries between categories. Moreover, we also tested the classification accuracy and clustering effect of the…
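To make the role of the distance measure concrete, the following is a minimal sketch of Lloyd-style K-Means in which the distance is a swappable argument. It is not the authors' implementation: the view-distance itself is not defined in this excerpt, so only the Euclidean baseline is shown, and the names `kmeans`, `euclidean`, and the `distance` parameter are hypothetical. The centroid update by arithmetic mean is also an assumption carried over from standard K-Means; a different distance may call for a different update rule.

```python
import numpy as np

def euclidean(points, centers):
    """Pairwise Euclidean distances, shape (n_points, n_centers)."""
    diff = points[:, None, :] - centers[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def kmeans(points, k, distance=euclidean, n_iter=100, seed=0):
    """Lloyd-style K-Means with a pluggable distance (hypothetical interface).

    `distance(points, centers)` must return an (n_points, n_centers) array.
    """
    rng = np.random.default_rng(seed)
    # Initialise centers by sampling k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = distance(points, centers).argmin(axis=1)
        # Update step: mean of members (assumption: standard K-Means update).
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment so labels match the returned centers.
    labels = distance(points, centers).argmin(axis=1)
    return labels, centers
```

Under these assumptions, an alternative measure could be tried by passing any function with the same signature as `distance`, leaving the rest of the loop unchanged.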

Taxonomy

Topics: Advanced Clustering Algorithms Research · Advanced Computing and Algorithms

Methods: k-Means Clustering