Distributional Clustering: A distribution-preserving clustering method
Arvind Krishna, Simon Mak, Roshan Joseph

TL;DR
This paper introduces a novel clustering method called distributional clustering that ensures cluster centers accurately reflect the data distribution, addressing limitations of traditional k-means in distribution preservation.
Contribution
The paper proposes distributional clustering, proves that its cluster centers converge asymptotically to the data-generating distribution, and presents an efficient algorithm for computing those centers in practice.
Findings
The cluster centers produced by distributional clustering converge asymptotically to the data-generating distribution.
The method preserves the data distribution on both synthetic and real datasets.
Compared with k-means, its cluster centers mimic the distribution of the underlying data more closely.
Abstract
One key use of k-means clustering is to identify cluster prototypes which can serve as representative points for a dataset. However, a drawback of using k-means cluster centers as representative points is that such points distort the distribution of the underlying data. This can be highly disadvantageous in problems where the representative points are subsequently used to gain insights on the data distribution, as these points do not mimic the distribution of the data. To this end, we propose a new clustering method called "distributional clustering", which ensures cluster centers capture the distribution of the underlying data. We first prove the asymptotic convergence of the proposed cluster centers to the data generating distribution, then present an efficient algorithm for computing these cluster centers in practice. Finally, we demonstrate the effectiveness of distributional…
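The distortion described in the abstract can be seen in a small experiment. The following is a minimal sketch, not the authors' method or code: it runs ordinary Lloyd iterations for k-means on one-dimensional standard normal samples and compares the spread of the resulting centers with the spread of the data. The function name `lloyd_1d`, the sample size, the number of clusters, and the iteration count are all illustrative assumptions.

```python
import numpy as np


def lloyd_1d(x, k, n_iter=200):
    """Plain Lloyd iterations for 1-D k-means (illustrative helper, not from the paper)."""
    # Start from evenly spaced empirical quantiles so every cluster begins non-empty.
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # In 1-D, nearest-center regions are separated by midpoints of adjacent centers.
        edges = (centers[:-1] + centers[1:]) / 2
        labels = np.searchsorted(edges, x)
        counts = np.bincount(labels, minlength=k)
        sums = np.bincount(labels, weights=x, minlength=k)
        # Move each center to the mean of its points; keep the old center if a cluster is empty.
        updated = np.where(counts > 0, sums / np.maximum(counts, 1), centers)
        centers = np.sort(updated)
    return centers


rng = np.random.default_rng(0)
data = rng.standard_normal(20_000)   # samples from N(0, 1)
centers = lloyd_1d(data, k=20)

data_sd = data.std()
centers_sd = centers.std()           # unweighted spread of the k-means centers
print(f"std of data:    {data_sd:.3f}")
print(f"std of centers: {centers_sd:.3f}")
```

In this sketch the centers, treated as an unweighted point set, come out noticeably more spread out than the data, because k-means places relatively more centers in the tails. This is consistent with the classical quantization result that optimal squared-error centers in d dimensions follow a density proportional to f^(d/(d+2)) rather than the data density f; the sketch only illustrates the problem that motivates distributional clustering and does not implement the proposed method.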
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Face and Expression Recognition
Methods: k-Means Clustering
