Parameterized Complexity of Categorical Clustering with Size Constraints
Fedor V. Fomin, Petr A. Golovach, and Nidhi Purohit

TL;DR
This paper extends fixed-parameter tractable algorithms for categorical clustering with Hamming distance to include size constraints on clusters, broadening applicability to more complex clustering scenarios.
Contribution
It introduces an algorithm for capacitated categorical clustering, incorporating cluster size constraints, building on prior work for the binary case.
Findings
The algorithm solves capacitated clustering in 2^{O(B\log B)} |\Sigma|^B (mn)^{O(1)} time.
Extends previous algorithms to more general clustering models.
Enables solutions for various constrained clustering variants.
Abstract
In the Categorical Clustering problem, we are given a set of vectors (matrix) A=\{a_1,\ldots,a_n\} over \Sigma^m, where \Sigma is a finite alphabet, and integers k and B. The task is to partition A into k clusters such that the median objective of the clustering in the Hamming norm is at most B. That is, we seek a partition \{I_1,\ldots,I_k\} of \{1,\ldots,n\} and vectors c_1,\ldots,c_k\in\Sigma^m such that \sum_{i=1}^k\sum_{j\in I_i}d_H(c_i,a_j)\leq B, where d_H(a,b) is the Hamming distance between vectors a and b. Fomin, Golovach, and Panolan [ICALP 2018] proved that the problem is fixed-parameter tractable for the binary case \Sigma=\{0,1\} by giving an algorithm that solves it in time 2^{O(B\log B)} (mn)^{O(1)}. We extend this algorithmic result to a popular capacitated clustering model, where in addition the sizes of the clusters should satisfy certain constraints. More precisely,…
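To make the median objective concrete, here is a minimal Python sketch that evaluates \sum_{i=1}^k\sum_{j\in I_i}d_H(c_i,a_j) for a given partition. It is an illustration of the objective only, not the paper's algorithm, and it ignores the size constraints. The function names (hamming, median_center, clustering_cost) are hypothetical, and indices are 0-based rather than 1-based. It relies on the standard fact that, for a fixed cluster, choosing the most frequent symbol in each coordinate gives a center minimizing the total Hamming distance.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple

Vector = Sequence[Hashable]  # one row a_j over the alphabet Sigma


def hamming(a: Vector, b: Vector) -> int:
    """Hamming distance d_H(a, b): number of coordinates where a and b differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def median_center(cluster: List[Vector]) -> Tuple[Hashable, ...]:
    """Coordinate-wise most frequent symbol (ties broken arbitrarily)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*cluster))


def clustering_cost(vectors: List[Vector], partition: List[List[int]]) -> int:
    """Median objective of a partition, using an optimal center for each cluster."""
    total = 0
    for indices in partition:
        if not indices:  # an empty cluster contributes nothing
            continue
        cluster = [vectors[j] for j in indices]
        center = median_center(cluster)
        total += sum(hamming(center, a) for a in cluster)
    return total


# Hypothetical instance over Sigma = {a, b, c} with m = 3 and n = 4.
rows = ["aab", "aac", "bbb", "bbc"]
print(clustering_cost(rows, [[0, 1], [2, 3]]))  # cost of grouping similar rows
print(clustering_cost(rows, [[0, 2], [1, 3]]))  # cost of a worse grouping
```

A partition is a yes-certificate for the instance exactly when its cost is at most B; the difficulty the paper addresses lies in finding such a partition, not in evaluating one.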
Taxonomy
Topics: Advanced Clustering Algorithms Research · Random Matrices and Applications · Multi-Criteria Decision Making
