Computing $k$-means in mixed precision

Erin Carson; Xinye Chen; Xiaobo Liu

arXiv:2407.12208·math.NA·April 21, 2026

TL;DR

This paper explores the use of mixed precision arithmetic in the k-means clustering algorithm, demonstrating potential speedups and robustness in various data scenarios through extensive simulations.

Contribution

The paper introduces a mixed-precision framework for k-means, analyzes its numerical stability and effectiveness, especially for normalized data, and provides practical insights for hardware acceleration.

Findings

01  Normalized data tolerates reduced precision well.

02  Unnormalized data requires careful handling to avoid overflow.

03  Mixed precision can accelerate k-means without sacrificing accuracy.
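
The first two findings can be illustrated with a small NumPy experiment. This is a minimal sketch and not the paper's code: the helper `sq_norms`, the synthetic data, and the choice of z-score standardization as the normalization are all assumptions made for illustration. It computes row-wise squared norms entirely in IEEE half precision (largest finite value 65504), once on raw features and once on standardized ones.

```python
import numpy as np

def sq_norms(X, dtype):
    """Row-wise squared Euclidean norms, with inputs and result stored in `dtype`.

    Hypothetical helper for illustration only.
    """
    Xl = X.astype(dtype)
    return np.sum(Xl * Xl, axis=1, dtype=dtype)

rng = np.random.default_rng(0)
# Synthetic, unnormalized features (assumed data): each squared entry is
# at least 40000, so a sum of 8 of them exceeds the fp16 maximum of 65504.
X = rng.uniform(200.0, 400.0, size=(5, 8))

with np.errstate(over="ignore"):
    raw_fp16 = sq_norms(X, np.float16)        # overflows to inf

# z-score standardization, done in double precision before the downcast
Z = (X - X.mean(axis=0)) / X.std(axis=0)
norm_fp16 = sq_norms(Z, np.float16)           # stays finite
norm_fp64 = sq_norms(Z, np.float64)           # reference values

rel_err = np.abs(norm_fp16.astype(np.float64) - norm_fp64) / norm_fp64
print("raw data, fp16:       ", raw_fp16)
print("normalized data, fp16:", norm_fp16)
print("max relative error:   ", rel_err.max())
```

On the raw data every squared norm overflows, while the standardized data stays well inside the half-precision range and agrees with the double-precision reference to roughly the accuracy expected of fp16.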

Abstract

The k-means algorithm is one of the most popular and critical techniques in data mining and machine learning, and it has achieved significant success in numerous science and engineering domains. Computing k-means to a global optimum is NP-hard in Euclidean space, yet there are a variety of efficient heuristic algorithms, such as Lloyd's algorithm, that converge to a local optimum with superpolynomial complexity in the worst case. Motivated by the emergence and prominence of mixed precision capabilities in hardware, a current trend is to develop low and mixed precision variants of algorithms in order to improve the runtime and energy consumption. In this paper we study the numerical stability of Lloyd's k-means algorithm, and, in particular, we confirm the stability of the widely used distance computation formula. We propose a mixed-precision framework for k-means computation and…
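
The abstract does not spell out the "widely used distance computation formula"; a common choice in k-means implementations, assumed here, is the expansion ‖x − c‖² = ‖x‖² − 2xᵀc + ‖c‖². The sketch below evaluates that expansion in single precision and compares the resulting assignments with a double-precision reference that forms the differences directly. The function names, the synthetic data, and the use of float32 as the low precision are illustrative assumptions, not the authors' framework.

```python
import numpy as np

def sq_dists_expanded(X, C, dtype=np.float32):
    """Squared distances via ||x||^2 - 2 x^T c + ||c||^2, computed in `dtype`."""
    Xl, Cl = X.astype(dtype), C.astype(dtype)
    x2 = np.sum(Xl * Xl, axis=1)[:, None]     # shape (n, 1)
    c2 = np.sum(Cl * Cl, axis=1)[None, :]     # shape (1, k)
    D = x2 - 2.0 * (Xl @ Cl.T) + c2           # shape (n, k)
    # Cancellation can produce tiny negative values; clamp them to zero.
    return np.maximum(D, 0)

def sq_dists_direct(X, C):
    """Reference: form x - c explicitly and sum the squares in float64."""
    diff = X[:, None, :] - C[None, :, :]
    return np.sum(diff * diff, axis=2)

rng = np.random.default_rng(1)
# Three well-separated centers in 5 dimensions (assumed test data)
C = np.array([[0.0] * 5, [10.0] * 5, [-10.0] * 5])
true_labels = rng.integers(0, 3, size=200)
X = C[true_labels] + rng.standard_normal((200, 5))

D32 = sq_dists_expanded(X, C, np.float32)
D64 = sq_dists_direct(X, C)

labels32 = np.argmin(D32, axis=1)             # assignment step, low precision
labels64 = np.argmin(D64, axis=1)             # assignment step, reference

print("max abs error in squared distances:", np.max(np.abs(D32 - D64)))
print("assignments agree:", np.array_equal(labels32, labels64))
```

The expanded form turns the bulk of the work into a matrix product, which is the operation that low-precision hardware tends to accelerate; the trade-off is possible cancellation when a point lies very close to a center, which is why the result is clamped at zero.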
