A survey on feature weighting based K-Means algorithms

Renato Cordeiro de Amorim

arXiv:1601.03483·cs.LG·January 15, 2016


TL;DR

This survey reviews feature weighting methods for K-Means clustering, comparing their cluster recovery ability empirically, identifying common flaws, and outlining directions for future research.

Contribution

The survey analyzes existing feature weighting K-Means algorithms, highlighting their strengths, weaknesses, and potential avenues for improvement.

Findings

01. Identifies key strengths and limitations of current feature weighting methods

02. Provides an empirical comparison of cluster recovery ability

03. Suggests future research directions in feature weighting algorithms

Abstract

In a real-world data set there is always the possibility, rather high in our opinion, that different features may have different degrees of relevance. Most machine learning algorithms deal with this fact by either selecting or deselecting features in the data preprocessing phase. However, we maintain that even among relevant features there may be different degrees of relevance, and this should be taken into account during the clustering process. With over 50 years of history, K-Means is arguably the most popular partitional clustering algorithm there is. The first K-Means based clustering algorithm to compute feature weights was designed just over 30 years ago. Various such algorithms have been designed since but there has not been, to our knowledge, a survey integrating empirical evidence of cluster recovery ability, common flaws, and possible directions for future research. This paper…
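
To make the idea of feature weighting in K-Means concrete, here is a minimal sketch. It is not taken from the survey and does not reproduce any one algorithm it reviews. It assumes one common style of scheme: each feature receives a weight that shrinks as its within-cluster dispersion grows, the weights sum to one, and an exponent `beta` > 1 controls how sharply they differ. The function name `weighted_kmeans`, its parameters, and the optional `init` argument are hypothetical choices made for illustration.

```python
import numpy as np


def weighted_kmeans(X, k, beta=2.0, n_iter=50, seed=0, init=None):
    """Illustrative feature-weighted K-Means (a sketch under stated assumptions).

    Assumption: weights are inversely related to each feature's within-cluster
    dispersion and are raised to the power `beta` inside the distance.
    """
    if beta <= 1.0:
        raise ValueError("beta must be greater than 1 for this update rule")
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    rng = np.random.default_rng(seed)
    if init is None:
        # Pick k distinct data points as starting centroids.
        centroids = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centroids = np.array(init, dtype=float)
    weights = np.full(m, 1.0 / m)  # start with equal feature weights
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Assignment step: weighted squared Euclidean distance to each centroid.
        diff2 = (X[:, None, :] - centroids[None, :, :]) ** 2  # shape (n, k, m)
        dist = (diff2 * weights ** beta).sum(axis=2)          # shape (n, k)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so centroids and weights are too
        labels = new_labels

        # Centroid step: mean of the points assigned to each cluster.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)

        # Weight step: features with low within-cluster dispersion get
        # high weight; the small constant avoids division by zero.
        disp = ((X - centroids[labels]) ** 2).sum(axis=0) + 1e-12
        ratio = (disp[:, None] / disp[None, :]) ** (1.0 / (beta - 1.0))
        weights = 1.0 / ratio.sum(axis=1)

    return labels, centroids, weights
```

On data where two features separate the clusters and a third is noise, this sketch should assign the noise feature a weight close to zero. Because the weights depend on the current partition, a poor initialisation can reinforce a poor weighting, so results from a single run should be treated with caution.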
