Using MM principles to deal with incomplete data in K-means clustering
Ali Beikmohammadi

TL;DR
This paper proposes a method based on Majorization-Minimization principles to effectively handle incomplete data in K-means clustering, restoring data symmetry to improve clustering performance.
Contribution
It introduces a novel MM-based approach for imputing missing data in K-means, enabling the algorithm to work effectively with incomplete datasets.
Findings
The proposed method improves clustering accuracy on standard datasets.
The algorithm effectively restores the symmetry of data with missing attributes.
Source code and pseudo-code are provided for reproducibility.
Abstract
Among the many clustering algorithms, K-means is widely used because of its simplicity and fast convergence. However, it struggles with incomplete data, where some samples are missing some of their attributes. To address this problem, we apply MM principles to restore the symmetry of the data so that K-means can work well. We give pseudo-code for the algorithm and verify it experimentally on standard datasets. The source code for the experiments is publicly available at https://github.com/AliBeikmohammadi/MM-Optimization/blob/main/mini-project/MM%20K-means.ipynb.
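The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to apply MM to K-means with missing entries, not necessarily the paper's exact procedure. It assumes that the objective is the squared error over observed entries, and that the surrogate is the ordinary complete-data K-means objective with each missing entry filled by the matching coordinate of the sample's currently assigned centroid. The function name mm_kmeans, its parameters, the column-mean initial fill, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def mm_kmeans(X, k, n_iter=50, seed=0):
    """Illustrative MM-style K-means for data with NaNs (not the paper's code).

    Assumes every column has at least one observed value and n_iter >= 1.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initial fill (an assumption): column means over the observed entries.
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(missing, col_means, X)

    # Initialise centroids from k distinct rows of the filled data.
    centroids = X_filled[rng.choice(len(X), size=k, replace=False)].copy()

    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid under squared Euclidean distance.
        dists = ((X_filled[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)

        # Update step: mean of the filled points in each non-empty cluster.
        for j in range(k):
            members = new_labels == j
            if members.any():
                centroids[j] = X_filled[members].mean(axis=0)

        # Majorization step: refill missing entries from the assigned centroid.
        new_filled = np.where(missing, centroids[new_labels], X)

        converged = (
            labels is not None
            and np.array_equal(new_labels, labels)
            and np.allclose(new_filled, X_filled)
        )
        labels, X_filled = new_labels, new_filled
        if converged:
            break

    return labels, centroids, X_filled

# Toy usage with made-up data: two groups, a few attributes missing.
X_demo = np.array([
    [0.0, 0.0], [0.1, np.nan], [np.nan, 0.2],
    [10.0, 10.0], [10.1, np.nan], [np.nan, 9.9],
])
demo_labels, demo_centroids, demo_filled = mm_kmeans(X_demo, k=2)
print(demo_labels)
print(demo_filled)
```

Observed entries are never modified; only the missing positions are overwritten on each pass. Like ordinary K-means, this loop can stop at a local optimum that depends on the random initialisation, so running it with several seeds and keeping the best result is a reasonable precaution.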
Taxonomy
Topics: Advanced Clustering Algorithms Research · Metaheuristic Optimization Algorithms Research · Face and Expression Recognition
Methods: k-Means Clustering
