Streaming PTAS for Constrained k-Means
Dishant Goyal, Ragesh Jaiswal, Amit Kumar

TL;DR
This paper gives a streaming PTAS for constrained k-means clustering, built on a generalized algorithm for the list-k-means problem, and a faster PTAS for instances that satisfy a stability assumption.
Contribution
It generalizes the list-k-means results of Bhattacharya et al. to the streaming setting and speeds up the PTAS for constrained k-means under stability assumptions.
Findings
Developed a 2-pass streaming algorithm for list-k-means.
Converted the streaming algorithm into a 4-pass logspace PTAS for constrained k-means.
Improved the running time of the PTAS for k-means instances that satisfy a stability condition.
Abstract
We generalise the results of Bhattacharya et al. (Journal of Computing Systems, 62(1):93-115, 2018) for the list-$k$-means problem, defined as follows: for an (unknown) partition $X_1, \dots, X_k$ of the dataset $X \subseteq \mathbb{R}^d$, find a list of $k$-center sets (each element in the list is a set of $k$ centers) such that at least one of the $k$-center sets $\{c_1, \dots, c_k\}$ in the list gives a $(1+\varepsilon)$-approximation with respect to the cost function $\min_{\text{permutation } \pi} \left[ \sum_{i=1}^{k} \sum_{x \in X_i} \|x - c_{\pi(i)}\|^2 \right]$. The list-$k$-means problem is important for the constrained $k$-means problem since algorithms for the former can be converted to PTAS for various versions of the latter. Following are the consequences of our generalisations: - Streaming algorithm: Our $D^2$-sampling based algorithm running in a single iteration allows us to…
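To make the list-k-means objective concrete, here is a minimal Python sketch of how a candidate center set could be scored against a fixed partition: each cluster is matched to a distinct center, and the cheapest matching over all permutations is taken. The function names (`sq_dist`, `partition_cost`, `best_in_list`) and the toy data are illustrative assumptions, not taken from the paper, and the brute-force search over permutations is only meant for small k. It illustrates the objective only, not the paper's streaming algorithm.

```python
from itertools import permutations


def sq_dist(x, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def partition_cost(clusters, centers):
    """Cost of `centers` for a fixed partition `clusters`:
    min over permutations pi of sum_i sum_{x in X_i} ||x - c_pi(i)||^2.
    Brute force over all k! permutations (hypothetical helper, small k only)."""
    k = len(clusters)
    if len(centers) != k:
        raise ValueError("need exactly one center per cluster")
    # pairwise[i][j] = cost of serving every point of cluster i from center j
    pairwise = [
        [sum(sq_dist(x, c) for x in cluster) for c in centers]
        for cluster in clusters
    ]
    return min(
        sum(pairwise[i][pi[i]] for i in range(k))
        for pi in permutations(range(k))
    )


def best_in_list(clusters, candidate_list):
    """Pick the center set in the list with the lowest partition cost."""
    return min(candidate_list, key=lambda centers: partition_cost(clusters, centers))


# Toy instance: two clusters on a line and a list of two candidate center sets.
clusters = [[(0, 0), (2, 0)], [(10, 0), (12, 0)]]
candidates = [
    [(11, 0), (1, 0)],  # cluster means, listed in swapped order
    [(0, 0), (12, 0)],  # one extreme point from each cluster
]
best = best_in_list(clusters, candidates)
```

The first candidate is only good once the permutation swaps its two centers, which is why the objective minimizes over matchings rather than fixing an order. For larger k, the inner minimization is an assignment problem and could be solved with a polynomial-time matching routine instead of enumerating permutations.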
