Solving clustering as ill-posed problem: experiments with K-Means algorithm

Alberto Arturo Vergani

arXiv:2211.08302·math.NA·November 16, 2022

Open Access

TL;DR

This paper investigates K-Means clustering as an ill-posed inverse problem, exploring PCA-based feature reduction and testing a theorem linking cluster count to PCA components using neuroscientific fMRI data.

Contribution

It introduces a novel perspective of viewing K-Means as an ill-posed problem and evaluates PCA feature selection methods in this context.

Findings

01 Wishart criteria improve PCA feature selection for clustering

02 Theorem linking cluster count and PCA components is validated

03 PCA reduction with Wishart criteria yields low matrix condition number

Abstract

In this contribution, the clustering procedure based on the K-Means algorithm is studied as an inverse problem, which is a special case of ill-posed problems. Attempts to improve the quality of the clustering inverse problem lead to reducing the input data via Principal Component Analysis (PCA). Since a theorem by Ding and He links the cardinality of the optimal clusters found with K-Means to the cardinality of the selected informative PCA components, the computational experiments tested the theorem with two quantitative feature selection methods: Kaiser criteria (based on imperative decision) versus Wishart criteria (based on random matrix theory). The results suggested that PCA reduction with feature selection by Wishart criteria leads to a low matrix condition number and satisfies the relation between clusters and components predicted by the theorem. The data…
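The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's code and rests on several assumptions: the Kaiser rule is taken in its usual form (keep correlation-matrix eigenvalues above 1); the Wishart criteria are approximated by the Marchenko-Pastur upper edge for unit-variance noise, which this listing does not confirm; the Ding and He relation is taken in its commonly cited form of K - 1 components for K clusters; and the condition number is that of the data projected onto the retained components. The eigenvalue spectrum and sample size are hypothetical, not taken from the fMRI data.

```python
import math


def kaiser_count(eigenvalues):
    """Kaiser rule: keep components whose correlation-matrix eigenvalue exceeds 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def marchenko_pastur_upper_edge(n_samples, n_features, sigma2=1.0):
    """Largest eigenvalue expected from pure noise (assumed stand-in for the Wishart criteria)."""
    ratio = n_features / n_samples
    return sigma2 * (1.0 + math.sqrt(ratio)) ** 2


def wishart_count(eigenvalues, n_samples, n_features):
    """Keep components whose eigenvalue lies above the Marchenko-Pastur noise edge."""
    edge = marchenko_pastur_upper_edge(n_samples, n_features)
    return sum(1 for ev in eigenvalues if ev > edge)


def condition_number(eigenvalues, n_keep):
    """2-norm condition number of the data projected onto the top n_keep components.

    The singular values of the projected data scale as the square roots of the
    retained eigenvalues, so the ratio is sqrt(largest / smallest retained).
    """
    kept = sorted(eigenvalues, reverse=True)[:n_keep]
    return math.sqrt(kept[0] / kept[-1])


def implied_cluster_count(n_components):
    """Assumed Ding and He relation: K clusters correspond to K - 1 components."""
    return n_components + 1


# Hypothetical correlation-matrix spectrum for 8 features and 200 samples.
eigenvalues = [3.2, 2.0, 1.15, 0.7, 0.45, 0.3, 0.15, 0.05]
n_samples, n_features = 200, 8

n_kaiser = kaiser_count(eigenvalues)
n_wishart = wishart_count(eigenvalues, n_samples, n_features)

for name, n_keep in (("Kaiser", n_kaiser), ("Wishart", n_wishart)):
    print(
        name,
        "components:", n_keep,
        "implied clusters:", implied_cluster_count(n_keep),
        "condition number:", round(condition_number(eigenvalues, n_keep), 3),
    )
```

In this toy spectrum the noise-edge threshold is stricter than the Kaiser cutoff, so it retains fewer components and the projected data have a smaller condition number. That is only the qualitative pattern reported in the abstract; the paper's actual thresholds and data may differ.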

Taxonomy

Topics: Statistical and numerical algorithms · Face and Expression Recognition · Neural Networks and Applications

Methods: Principal Components Analysis