Consistency of Lloyd's Algorithm Under Perturbations

Dhruv Patel; Hui Shen; Shankar Bhamidi; Yufeng Liu; Vladas Pipiras

arXiv:2309.00578·cs.LG·April 28, 2026

TL;DR

This paper proves that Lloyd's algorithm maintains an exponentially bounded mis-clustering rate under small perturbations, extending previous results to more realistic data scenarios involving pre-processing steps.

Contribution

The paper demonstrates that Lloyd's algorithm, when properly initialized, retains an exponentially bounded mis-clustering rate under small data perturbations, with implications for clustering applications that rely on pre-processed data.

Findings

01. Mis-clustering rate remains exponentially bounded under small perturbations.

02. Proper initialization ensures the correctness of Lloyd's algorithm in perturbed settings.

03. Results apply to high-dimensional data, time series, and network community detection.

Abstract

In the context of unsupervised learning, Lloyd's algorithm is one of the most widely used clustering algorithms. It has inspired a plethora of work investigating the correctness of the algorithm under various settings with ground truth clusters. In particular, in 2016, Lu and Zhou showed that the mis-clustering rate of Lloyd's algorithm on $n$ independent samples from a sub-Gaussian mixture is exponentially bounded after $O(\log(n))$ iterations, assuming proper initialization of the algorithm. However, in many applications, the true samples are unobserved and need to be learned from the data via pre-processing pipelines such as spectral methods on appropriate data matrices. We show that the mis-clustering rate of Lloyd's algorithm on perturbed samples from a sub-Gaussian mixture is also exponentially bounded after $O(\log(n))$ iterations under the assumptions of proper…
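To make the setting in the abstract concrete, here is a minimal sketch in Python (NumPy only) of Lloyd's algorithm run on perturbed samples from a two-component Gaussian mixture, together with a mis-clustering rate computed up to label permutation. It is an illustration under stated assumptions, not the paper's experimental setup: the names `lloyd`, `misclustering_rate`, and `run_demo`, the additive Gaussian perturbation, the fixed iteration count, the cluster separation, and the initialization near the true centers are all hypothetical choices made for this example.

```python
import itertools
import numpy as np


def lloyd(points, init_centers, n_iter=20):
    """Plain Lloyd iterations: assign to the nearest center, then recompute means."""
    centers = np.array(init_centers, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels, centers


def misclustering_rate(true_labels, est_labels, k):
    """Fraction of mislabeled points, minimized over relabelings of the k clusters."""
    best = 1.0
    for perm in itertools.permutations(range(k)):
        mapped = np.array(perm)[est_labels]
        best = min(best, float(np.mean(mapped != true_labels)))
    return best


def run_demo(n=400, dim=2, separation=6.0, noise_scale=0.3, seed=0):
    """Cluster perturbed mixture samples and return the mis-clustering rate."""
    rng = np.random.default_rng(seed)
    # Two well-separated true centers (assumed values for illustration).
    true_centers = np.zeros((2, dim))
    true_centers[1, 0] = separation
    true_labels = rng.integers(0, 2, size=n)
    clean = true_centers[true_labels] + rng.normal(size=(n, dim))
    # Assumed perturbation model: small additive noise standing in for the
    # error introduced by a pre-processing step.
    perturbed = clean + noise_scale * rng.normal(size=(n, dim))
    # Assumed "proper initialization": the true centers, slightly shifted.
    init_centers = true_centers + 0.5
    est_labels, _ = lloyd(perturbed, init_centers)
    return misclustering_rate(true_labels, est_labels, k=2)


if __name__ == "__main__":
    print("mis-clustering rate on perturbed samples:", run_demo())
```

Increasing `noise_scale` or reducing `separation` in `run_demo` gives a rough feel for how the mis-clustering rate responds to larger perturbations or weaker cluster separation; the brute-force permutation search in `misclustering_rate` is only practical for a small number of clusters.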
