The Bane of Low-Dimensionality Clustering
Vincent Cohen-Addad, Arnaud de Mesmay, Eva Rotenberg, Alan Roytman

TL;DR
This paper establishes strong computational lower bounds for k-median and k-means clustering problems in low-dimensional Euclidean spaces, showing they are computationally hard even in dimensions as low as three or four.
Contribution
It provides the first conditional lower bounds for these clustering problems in low dimensions, in contrast with other geometric problems that become easier when the dimension is low.
Findings
Lower bound of $n^{\tilde{\Omega}(k)}$ for k-median and k-means in 4D.
No $n^{o(\sqrt{k})}$-time algorithm in 2D for penalized clustering.
Matching upper bounds in 2D for penalized clustering.
Abstract
In this paper, we give a conditional lower bound of $n^{\Omega(k)}$ on running time for the classic k-median and k-means clustering objectives (where n is the size of the input), even in low-dimensional Euclidean space of dimension four, assuming the Exponential Time Hypothesis (ETH). We also consider k-median (and k-means) with penalties where each point need not be assigned to a center, in which case it must pay a penalty, and extend our lower bound to at least three-dimensional Euclidean space. This stands in stark contrast to many other geometric problems such as the traveling salesman problem, or computing an independent set of unit spheres. While these problems benefit from the so-called (limited) blessing of dimensionality, as they can be solved in time $n^{O(k^{1-1/d})}$ or $2^{n^{1-1/d}}$ in d dimensions, our work shows that widely-used clustering objectives have a lower…
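To make the penalized objective from the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a discrete variant in which centers are chosen among the input points, and the names `penalized_cost` and `brute_force` are hypothetical. Each point pays either its distance to the nearest center (squared for k-means) or a fixed penalty, whichever is smaller. The exhaustive search tries every k-subset of the input, which is the roughly $n^{k}$ behaviour that the lower bounds suggest cannot be substantially improved in dimension four under ETH.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def penalized_cost(points, centers, penalty, power=1):
    """Sum over points of min(dist to nearest center ** power, penalty).

    power=1 gives the k-median objective, power=2 the k-means objective.
    The uniform penalty is an assumption made for this illustration.
    """
    total = 0.0
    for p in points:
        nearest = min(dist(p, c) for c in centers) ** power
        total += min(nearest, penalty)  # unassigned points pay the penalty
    return total


def brute_force(points, k, penalty, power=1):
    """Try every k-subset of the input as centers (about n^k candidates)."""
    best = min(
        combinations(points, k),
        key=lambda cs: penalized_cost(points, cs, penalty, power),
    )
    return best, penalized_cost(points, best, penalty, power)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 0)]
    centers, cost = brute_force(pts, k=2, penalty=5)
    print(centers, cost)
```

In this toy instance the far-away point `(100, 0)` is cheaper to leave unassigned than to serve, so it pays the penalty while the two nearby pairs each share a center.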
