Multi-Swap $k$-Means++
Lorenzo Beretta, Vincent Cohen-Addad, Silvio Lattanzi, Nikos Parotsidis

TL;DR
This paper introduces a multi-swap local search algorithm for $k$-means clustering that achieves a $9 + \varepsilon$ approximation ratio and improves solution quality over previous local search methods in practice.
Contribution
It extends existing local search algorithms by allowing multiple simultaneous swaps, achieving a $9 + \varepsilon$ approximation ratio, the best possible for local search, with practical performance gains.
Findings
Achieves a $9 + \varepsilon$ approximation ratio for $k$-means clustering.
Demonstrates significant quality improvements over previous local search methods.
Shows practical effectiveness on several datasets.
Abstract
The $k$-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' choice algorithm for optimizing the popular $k$-means clustering objective and is known to give an $O(\log k)$-approximation in expectation. To obtain higher quality solutions, Lattanzi and Sohler (ICML 2019) proposed augmenting $k$-means++ with $O(k \log \log k)$ local search steps obtained through the $k$-means++ sampling distribution to yield a $c$-approximation to the $k$-means clustering problem, where $c$ is a large absolute constant. Here we generalize and extend their local search algorithm by considering larger and more sophisticated local search neighborhoods, hence allowing multiple centers to be swapped at the same time. Our algorithm achieves a $9 + \varepsilon$ approximation ratio, which is the best possible for local search. Importantly we show that our approach yields substantial…
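The idea in the abstract can be illustrated with a minimal pure-Python sketch: seed with $k$-means++ ($D^2$ sampling), then repeatedly try to swap `p` current centers for `p` freshly sampled points, keeping a swap only if it lowers the $k$-means cost. This is a naive reading of "swap multiple centers at the same time", not the authors' implementation. The function names, the parameter `p`, the brute-force search over all size-`p` subsets of centers, and the toy data are all assumptions made for illustration. With `p = 1` it reduces to a single-swap step in the spirit of Lattanzi and Sohler.

```python
import random
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(pt, c) for c in centers) for pt in points)

def d2_sample(points, centers, rng):
    """Sample one point with probability proportional to its squared
    distance to the nearest current center (the k-means++ distribution)."""
    weights = [min(sq_dist(pt, c) for c in centers) for pt in points]
    if sum(weights) == 0:  # every point coincides with a center
        return rng.choice(points)
    return rng.choices(points, weights=weights, k=1)[0]

def kmeanspp_seed(points, k, rng):
    """k-means++ seeding: first center uniform, the rest by D^2 sampling."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(d2_sample(points, centers, rng))
    return centers

def multi_swap_step(points, centers, p, rng):
    """Hypothetical multi-swap step: sample p candidates, then try replacing
    every size-p subset of current centers with them; keep the best
    strictly improving swap, or the current centers if none improves."""
    candidates = [d2_sample(points, centers, rng) for _ in range(p)]
    best, best_cost = centers, cost(points, centers)
    for removed in combinations(range(len(centers)), p):
        kept = [c for i, c in enumerate(centers) if i not in removed]
        trial = kept + candidates
        trial_cost = cost(points, trial)
        if trial_cost < best_cost:
            best, best_cost = trial, trial_cost
    return best

# Toy data (assumed): three Gaussian blobs in the plane.
rng = random.Random(0)
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]

centers = kmeanspp_seed(points, k=3, rng=rng)
initial_cost = cost(points, centers)
for _ in range(20):
    centers = multi_swap_step(points, centers, p=2, rng=rng)
final_cost = cost(points, centers)
```

Because a swap is accepted only when it strictly lowers the cost, `final_cost` can never exceed `initial_cost`. The brute-force loop tries every size-`p` subset of the `k` centers, so its running time grows quickly with `p`; it is meant to show the neighborhood structure, not to be efficient.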
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Advanced Clustering Algorithms Research
