Faster Algorithms for the Constrained k-means Problem
Anup Bhattacharya, Ragesh Jaiswal, Amit Kumar

TL;DR
This paper introduces faster algorithms for a generalized k-means clustering problem in which the optimal clusters may be an arbitrary partition of the dataset. It bounds the size of the list of candidate centers needed to contain a near-optimal solution and improves the running time.
Contribution
The paper presents a randomized algorithm with improved running time, together with upper and lower bounds on the size of the list of candidate centers, for a generalized k-means problem whose optimal clusters are arbitrary.
Findings
Provides an upper bound of 2^{\tilde{O}(k/ε)} on the list size of centers.
Establishes a lower bound of 2^{\tilde{Ω}(k/√ε)} on the list size.
The algorithm runs in time O(nd · 2^{\tilde{O}(k/ε)}), improving on previous results.
Abstract
The classical center-based clustering problems such as k-means/median/center assume that the optimal clusters satisfy the locality property that the points in the same cluster are close to each other. A number of clustering problems arise in machine learning where the optimal clusters do not follow such a locality property. Consider a variant of the k-means problem that may be regarded as a general version of such problems. Here, the optimal clusters are an arbitrary partition of the dataset and the goal is to output k centers such that the objective function is minimized. It is not difficult to argue that any algorithm (without knowing the optimal clusters) that outputs a single set of centers will not behave well as far as optimizing the above objective function is concerned. However, this…
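To make the setting in the abstract concrete, here is a minimal sketch of how the cost of a fixed partition could be evaluated. It assumes the standard k-means cost (sum of squared Euclidean distances from each point to the center of its assigned cluster); the abstract is truncated before the objective is spelled out, so this is an assumption. The function name `partition_cost` and the toy data are hypothetical and are not taken from the paper. For a fixed cluster, the centroid is the center that minimizes this cost, which is what the sketch uses.

```python
import numpy as np

def partition_cost(points, labels, k):
    """Assumed objective: sum of squared distances from each point
    to the centroid of the cluster it is assigned to."""
    cost = 0.0
    centers = []
    for j in range(k):
        cluster = points[labels == j]
        if len(cluster) == 0:
            # An empty cluster contributes nothing to the cost.
            centers.append(None)
            continue
        center = cluster.mean(axis=0)  # centroid minimizes squared distance
        centers.append(center)
        cost += float(((cluster - center) ** 2).sum())
    return cost, centers

# Hypothetical toy data: four points on a line.
points = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])

# A partition that respects locality: nearby points share a cluster.
local = np.array([0, 0, 1, 1])
# An arbitrary partition with no locality: each cluster mixes far-apart points.
scattered = np.array([0, 1, 0, 1])

local_cost, _ = partition_cost(points, local, k=2)
scattered_cost, scattered_centers = partition_cost(points, scattered, k=2)
print(local_cost, scattered_cost)
```

In this toy instance the two partitions lead to very different centers and costs for the same dataset. An algorithm that does not know which partition is the target therefore cannot commit to one set of centers, which is the intuition behind outputting a list of candidates.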
