Computing $k$-means in mixed precision
Erin Carson, Xinye Chen, Xiaobo Liu

TL;DR
This paper studies mixed precision arithmetic in Lloyd's k-means algorithm. Extensive simulations indicate potential speedups and robust clustering on both normalized and unnormalized data, with overflow as the main risk for the latter.
Contribution
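It analyzes the numerical stability of Lloyd's k-means algorithm, confirms the stability of the widely used distance computation formula, and proposes a mixed-precision framework for k-means that is especially effective on normalized data. It also offers practical insights for hardware acceleration.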
Findings
Normalized data tolerates reduced precision well.
Unnormalized data requires careful handling to avoid overflow.
Mixed precision has the potential to accelerate k-means without sacrificing clustering accuracy.
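The overflow finding can be illustrated with a small sketch. It is not taken from the paper: it assumes IEEE half precision (NumPy `float16`, largest finite value 65504) as the low precision, and min-max scaling to [0, 1] as the normalization. The function name `sq_dist_fp16` and the sample values are hypothetical.

```python
import numpy as np

def sq_dist_fp16(x, c):
    """Squared Euclidean distance with every operation rounded to float16.

    Assumption: float16 stands in for the low-precision format.
    """
    x16 = np.asarray(x, dtype=np.float16)
    c16 = np.asarray(c, dtype=np.float16)
    diff = x16 - c16
    # Overflow is expected for large inputs, so silence NumPy's warning.
    with np.errstate(over="ignore"):
        return np.sum(diff * diff, dtype=np.float16)

# Unnormalized point: 300**2 = 90000 exceeds the float16 maximum (65504).
raw_point = np.array([300.0, 400.0])
center = np.array([0.0, 0.0])
raw_dist = sq_dist_fp16(raw_point, center)

# Same point after scaling the features to [0, 1] (hypothetical range of 1000).
scale = 1000.0
scaled_dist = sq_dist_fp16(raw_point / scale, center / scale)

print(raw_dist)     # overflows to inf
print(scaled_dist)  # finite, close to 0.25
```

Scaling keeps the squared terms far below the overflow threshold, which is one plausible reason normalized data tolerates reduced precision better.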
Abstract
The k-means algorithm is one of the most popular and critical techniques in data mining and machine learning, and it has achieved significant success in numerous science and engineering domains. Computing k-means to a global optimum is NP-hard in Euclidean space, yet there are a variety of efficient heuristic algorithms, such as Lloyd's algorithm, that converge to a local optimum with superpolynomial complexity in the worst case. Motivated by the emergence and prominence of mixed precision capabilities in hardware, a current trend is to develop low and mixed precision variants of algorithms in order to improve the runtime and energy consumption. In this paper we study the numerical stability of Lloyd's k-means algorithm, and, in particular, we confirm the stability of the widely used distance computation formula. We propose a mixed-precision framework for k-means computation and…
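The abstract is cut off before the framework is described, so the following is only a minimal sketch of one plausible arrangement, not the authors' method. It assumes that the "widely used distance computation formula" is the expansion ‖x − c‖² = ‖x‖² − 2xᵀc + ‖c‖², that distances are evaluated in a lower precision (`float32` here), and that centers are updated in `float64`. The names `pairwise_sq_dists` and `lloyd_step` are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(X, C, dtype=np.float32):
    """Squared distances via ||x||^2 - 2 x.c + ||c||^2, evaluated in `dtype`."""
    Xl = X.astype(dtype)
    Cl = C.astype(dtype)
    x_norms = np.sum(Xl * Xl, axis=1, keepdims=True)  # shape (n, 1)
    c_norms = np.sum(Cl * Cl, axis=1)                 # shape (k,)
    d = x_norms - dtype(2) * (Xl @ Cl.T) + c_norms    # shape (n, k)
    # Rounding can make tiny distances slightly negative; clamp at zero.
    return np.maximum(d, 0)

def lloyd_step(X, C, low=np.float32):
    """One Lloyd iteration: low-precision assignment, float64 center update."""
    labels = np.argmin(pairwise_sq_dists(X, C, low), axis=1)
    new_C = C.astype(np.float64).copy()
    for j in range(C.shape[0]):
        members = X[labels == j]
        if len(members) > 0:  # keep the old center if a cluster is empty
            new_C[j] = members.mean(axis=0, dtype=np.float64)
    return labels, new_C

# Tiny example with two well-separated groups.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
C = np.array([[0.0, 0.0], [10.0, 10.0]])
labels, new_C = lloyd_step(X, C)
print(labels)
print(new_C)
```

Passing `low=np.float16` instead lets one experiment with half precision, bearing in mind the overflow risk on unnormalized data.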
