Estimating the Mean Number of K-Means Clusters to Form

Robert A. Murphy

arXiv:1503.03488 · cs.LG · February 12, 2016

TL;DR

This paper uses the random cluster model to estimate, from a dataset's sample size, the mean number of clusters that K-Means should form.

Contribution

The paper derives a theoretical estimate of the mean number of K-Means clusters as a function of sample size, based on the random cluster model.

Findings

01. Provides a formula for estimating the mean number of clusters

02. Validates the model with empirical data

03. Offers insights into cluster formation dynamics

Abstract

The random cluster model is applied to the sample size of a dataset to derive an estimate of the mean number of K-Means clusters to form when classifying that dataset.

Taxonomy

Topics: Data Mining Algorithms and Applications · Advanced Clustering Algorithms Research · Machine Learning and Data Classification