Linear-Time Approximation Scheme for k-Means Clustering of Affine Subspaces
Kyungjin Cho, Eunjin Oh

TL;DR
This paper introduces a linear-time approximation scheme for k-means clustering of incomplete data points, represented as axis-parallel affine subspaces. It improves the running time of the previous O(n^2 d)-time algorithm by a factor of n.
Contribution
The paper presents a linear-time algorithm for k-means clustering of axis-parallel affine subspaces, reducing the running time from quadratic to linear in the number n of data points.
Findings
Achieves a (1+ε)-approximate solution in O(nd) time.
The constants hidden in the O(·) notation depend only on Δ, ε, and k.
Improves on the previous O(n^2 d)-time algorithm of Eiben et al. [SODA'21] by a factor of n.
Abstract
In this paper, we present a linear-time approximation scheme for $k$-means clustering of \emph{incomplete} data points in $d$-dimensional Euclidean space. An \emph{incomplete} data point with $\Delta$ unspecified entries is represented as an axis-parallel affine subspace of dimension $\Delta$. The distance between two incomplete data points is defined as the Euclidean distance between two closest points in the axis-parallel affine subspaces corresponding to the data points. We present an algorithm for $k$-means clustering of axis-parallel affine subspaces of dimension $\Delta$ that yields a $(1+\epsilon)$-approximate solution in $O(nd)$ time. The constants hidden behind $O(\cdot)$ depend only on $\Delta$, $\epsilon$, and $k$. This improves the $O(n^2 d)$-time algorithm by Eiben et al. [SODA'21] by a factor of $n$.
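The distance defined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm; it only evaluates the distance between two incomplete points and a k-means cost. It assumes an unspecified entry is encoded as `None`, that centers are ordinary points, and that the cost is the standard k-means objective (sum of squared distances to the nearest center). The names `subspace_distance` and `kmeans_cost` are hypothetical. Because the subspaces are axis-parallel, any coordinate that is unspecified in either point can be matched exactly, so it contributes nothing to the distance.

```python
import math
from typing import Optional, Sequence

# Assumption: an incomplete point is a sequence of floats,
# with None marking each unspecified entry.
Point = Sequence[Optional[float]]


def subspace_distance(p: Point, q: Point) -> float:
    """Euclidean distance between the closest points of two
    axis-parallel affine subspaces given as incomplete points."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension d")
    total = 0.0
    for a, b in zip(p, q):
        # A coordinate that is free in either subspace can be chosen
        # to match the other, so it adds nothing to the distance.
        if a is None or b is None:
            continue
        total += (a - b) ** 2
    return math.sqrt(total)


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum over points of the squared distance to the nearest center
    (standard k-means objective, assumed here)."""
    return sum(
        min(subspace_distance(p, c) for c in centers) ** 2
        for p in points
    )
```

For example, `subspace_distance((1.0, None, 3.0), (4.0, 5.0, None))` compares only the first coordinate, since the other two are unspecified in one of the points. Evaluating the cost this way takes O(ndk) time for n points and k centers; the difficulty the paper addresses is finding good centers, not evaluating the cost.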
