Reclassification formula that provides to surpass K-means method

M. Kharinov

arXiv:1209.6204 · cs.CV · September 28, 2012 · 5 cites

TL;DR

This paper introduces a reclassification formula for multidimensional data that improves clustering stability and efficiency over K-means by focusing on stable partitions with minimal total squared error.

Contribution

The paper proposes a novel reclassification formula that enables the calculation of optimal, stable data partitions, surpassing K-means in efficiency and stability.

Findings

01. The formula describes the change in total squared error due to reclassification.

02. The method produces stable partitions with minimal total squared error.

03. It is more efficient than traditional K-means clustering.
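
The first finding can be checked numerically. The sketch below assumes the well-known single-point form of such an identity (the one used in Hartigan-style exchange clustering): moving a point x from a cluster of n_src points into a cluster of n_dst points changes the total squared error by n_dst/(n_dst+1)·|x − c_dst|² − n_src/(n_src−1)·|x − c_src|², where c_src and c_dst are the cluster centers before the move. The paper's own formula may be stated differently or more generally; the function names and the toy data here are illustrative only.

```python
def mean(points):
    """Component-wise mean (cluster center) of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def sse(points):
    """Sum of squared distances from each point to the cluster center."""
    if not points:
        return 0.0
    center = mean(points)
    return sum(sq_dist(p, center) for p in points)


def delta_e(x, src, dst):
    """Change in total squared error if x moves from src to dst.

    Assumes x is a member of src and that src keeps at least one point.
    Centers are those of the clusters *before* the move.
    """
    n_src, n_dst = len(src), len(dst)
    if n_src < 2:
        raise ValueError("source cluster must keep at least one point")
    # Error added to the destination cluster by accepting x.
    added = n_dst / (n_dst + 1) * sq_dist(x, mean(dst)) if dst else 0.0
    # Error removed from the source cluster by releasing x.
    removed = n_src / (n_src - 1) * sq_dist(x, mean(src))
    return added - removed


# Toy data (hypothetical): the point (5, 0) sits closer to the second cluster.
src = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
dst = [(6.0, 0.0), (7.0, 0.0)]
x = (5.0, 0.0)

before = sse(src) + sse(dst)
after = sse([p for p in src if p != x]) + sse(dst + [x])

# The closed-form change agrees with recomputing the error from scratch.
assert abs(delta_e(x, src, dst) - (after - before)) < 1e-9
```

A negative value of `delta_e` means the move lowers the total squared error. A partition in which no single-point move has a negative `delta_e` is stable in this single-point sense, which is a stricter condition than the nearest-center rule at which K-means stops.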

Abstract

The paper presents a formula for the reclassification of multidimensional data points (columns of real numbers, "objects", "vectors", etc.). The formula describes the change in the total squared error caused by reclassifying data points from one cluster into another, and suggests a way to calculate a sequence of optimal partitions, each characterized by a minimum value of the total squared error E (weighted sum of within-class variances, within-cluster sum of squares WCSS, etc.), i.e. the sum of squared distances from each data point to its cluster center. The source data points are treated with repetitions allowed, and the resulting clusters from different partitions, in the general case, overlap each other. The final partitions are characterized by "equilibrium" stability with respect to the reclassification of the data points, where the term "stability" means that any…

Taxonomy

Topics: Neural Networks and Applications · Rough Sets and Fuzzy Logic · Advanced Clustering Algorithms Research