Constant factor approximations for Lower and Upper bounded Clusterings
Neelima Gupta, Sapna Grover, Rajni Dabas

TL;DR
This paper introduces a framework for approximating clustering problems with specified lower and upper bounds on cluster sizes. It achieves constant factor approximations that preserve the lower bounds and violate the upper bounds only by a bounded factor, supporting anonymity requirements (lower bounds) and ease of analysis (upper bounds).
Contribution
It provides the first constant factor approximation algorithms for lower and upper bounded clustering problems, including k-median and facility location, with bounded violations on upper bounds.
Findings
Achieves constant factor approximation for LUkM and LUkFL with bounded violations.
Improves upper bound violation for LUFL compared to previous work.
Provides approximation algorithms for LUkC and LUkS with uniform bounds.
Abstract
Clustering is one of the most fundamental problems in Machine Learning. Researchers in the field often require a lower bound on the size of the clusters to maintain anonymity and an upper bound for ease of analysis. Specifying an optimal cluster size is a problem often faced by scientists. In this paper, we present a framework to obtain constant factor approximations for some prominent clustering objectives, with lower and upper bounds on cluster size. This enables scientists to give an approximate cluster size by specifying the lower and the upper bounds for it. Our results preserve the lower bounds but may violate the upper bound a little. We study the problems when either of the bounds is uniform. We apply our framework to give the first constant factor approximations for LUkM and its…
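To make the constraint in the abstract concrete, here is a minimal sketch of how a candidate clustering could be checked against uniform lower and upper bounds, with the upper bound allowed to be exceeded by a multiplicative factor. It is an illustration only, not the paper's algorithm: the function names, the `beta` parameter, and the toy distances are all assumptions introduced here.

```python
from collections import Counter

def kmedian_cost(assignment, dist):
    """Sum of distances from each client to its assigned center (k-median objective)."""
    return sum(dist[client][center] for client, center in assignment.items())

def respects_bounds(assignment, lower, upper, beta=1.0):
    """Check cluster sizes of all used centers.

    Lower bounds must hold exactly; upper bounds may be exceeded
    by a factor of `beta` (hypothetical name for the violation factor).
    """
    sizes = Counter(assignment.values())
    return all(lower <= size <= beta * upper for size in sizes.values())

# Toy instance (made-up distances): six clients, two centers "a" and "b".
dist = {
    0: {"a": 1, "b": 4},
    1: {"a": 1, "b": 3},
    2: {"a": 2, "b": 3},
    3: {"a": 2, "b": 2},
    4: {"a": 5, "b": 1},
    5: {"a": 4, "b": 1},
}
assignment = {0: "a", 1: "a", 2: "a", 3: "a", 4: "b", 5: "b"}

cost = kmedian_cost(assignment, dist)                      # 1+1+2+2+1+1
strict_ok = respects_bounds(assignment, lower=2, upper=3)  # cluster "a" has 4 > 3
relaxed_ok = respects_bounds(assignment, lower=2, upper=3, beta=2.0)
```

In this toy instance the clustering fails the strict upper bound of 3 (center "a" serves 4 clients) but passes once the upper bound is relaxed by a factor of 2, while the lower bound of 2 holds in both cases. This mirrors the kind of guarantee the abstract describes: lower bounds preserved, upper bounds violated a little.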
Taxonomy
Topics: Facility Location and Emergency Management · Privacy-Preserving Technologies in Data · Complexity and Algorithms in Graphs
