An effective variant of the Hartigan $k$-means algorithm
François Clément, Stefan Steinerberger

TL;DR
This paper introduces a minor variation of Hartigan's k-means algorithm that consistently improves clustering results by an additional 2-5%, especially in higher dimensions or with larger k.
Contribution
It proposes a simple modification to Hartigan's algorithm that improves clustering quality beyond what Hartigan's method already achieves over Lloyd's.
Findings
Hartigan's algorithm outperforms Lloyd's in most cases.
A minor variation of Hartigan's method improves results by a further 2-5%.
Improvements are more significant with higher dimensions or larger k.
Abstract
The k-means problem is perhaps the classical clustering problem and is often treated as synonymous with Lloyd's algorithm (1957). It has become clear that Hartigan's algorithm (1975) gives better results in almost all cases; Telgarsky-Vattani note a typical improvement over Lloyd's algorithm. We point out that a very minor variation of Hartigan's method leads to another 2-5% improvement; the improvement tends to become larger when either the dimension or k increases.
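For readers unfamiliar with the baseline, the following is a minimal sketch of the standard Hartigan step: a single point moves from cluster A to cluster B whenever doing so lowers the k-means cost, which happens when n_B/(n_B+1) * ||x - c_B||^2 < n_A/(n_A-1) * ||x - c_A||^2. It is not the variant proposed in the paper, whose details are not given in this summary. The function names `kmeans_cost` and `hartigan`, the sweep order, and the tolerance are illustrative assumptions.

```python
import numpy as np

def kmeans_cost(X, labels, k):
    """Sum of squared distances from each point to its cluster mean."""
    cost = 0.0
    for j in range(k):
        pts = X[labels == j]
        if len(pts):
            cost += ((pts - pts.mean(axis=0)) ** 2).sum()
    return cost

def hartigan(X, labels, k, max_sweeps=100):
    """Standard Hartigan single-point moves (illustrative sketch).

    Assumes every cluster 0..k-1 is non-empty in the initial labels.
    """
    labels = labels.copy()
    counts = np.bincount(labels, minlength=k).astype(float)
    sums = np.zeros((k, X.shape[1]))
    np.add.at(sums, labels, X)  # per-cluster coordinate sums
    for _ in range(max_sweeps):
        moved = False
        for i, x in enumerate(X):
            a = labels[i]
            if counts[a] <= 1:
                continue  # never empty a cluster
            centers = sums / counts[:, None]
            d2 = ((centers - x) ** 2).sum(axis=1)
            # Cost saved by removing x from its current cluster a.
            gain = counts[a] / (counts[a] - 1) * d2[a]
            # Cost added by inserting x into each other cluster b.
            loss = counts / (counts + 1) * d2
            loss[a] = np.inf
            b = int(np.argmin(loss))
            if loss[b] < gain - 1e-12:  # move only on a strict decrease
                sums[a] -= x
                sums[b] += x
                counts[a] -= 1
                counts[b] += 1
                labels[i] = b
                moved = True
        if not moved:
            break
    return labels
```

Calling `hartigan(X, init_labels, k)` on an `(n, d)` array with any initial labeling that uses all `k` clusters returns a labeling whose `kmeans_cost` is no larger than that of the input, since every accepted move strictly decreases the cost. Unlike Lloyd's algorithm, the cluster means are updated after each individual move rather than once per full reassignment pass.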
