Random Cuts are Optimal for Explainable k-Medians
Konstantin Makarychev, Liren Shan

TL;DR
This paper proves that the RandomCoordinateCut algorithm is optimally competitive for explainable k-medians in l1, matching the known lower bounds and providing a tight analysis of its performance.
Contribution
The paper provides a tight analysis of the RandomCoordinateCut algorithm, establishing its optimality for explainable k-medians in l1 and matching the lower bounds.
Findings
The competitive ratio of the algorithm is upper bounded by 2ln k + 2.
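The bound matches the Omega(log k) lower bound of Dasgupta et al. (2020) up to a constant factor, so the algorithm's competitive ratio is asymptotically optimal.
The analysis tightens the previously known O(log k loglog k) competitive ratio for the same algorithm.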
Abstract
We show that the RandomCoordinateCut algorithm gives the optimal competitive ratio for explainable k-medians in l1. The problem of explainable k-medians was introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian in 2020. Several groups of authors independently proposed a simple polynomial-time randomized algorithm for the problem and showed that this algorithm is O(log k loglog k) competitive. We provide a tight analysis of the algorithm and prove that its competitive ratio is upper bounded by 2ln k + 2. This bound matches the Omega(log k) lower bound by Dasgupta et al. (2020).
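The abstract does not spell out the algorithm itself. As a rough illustration only, the following Python sketch shows one common way a random-coordinate-cut procedure is described: repeatedly draw a uniformly random coordinate and threshold, and split every leaf whose centers are separated by that cut, until each leaf holds a single center. The function names, the dict-based tree, and the choice to draw thresholds from the interval spanned by all center coordinates are assumptions made for this sketch and are not taken from the paper.

```python
import random

def random_coordinate_cut(centers, seed=0):
    """Build a threshold tree whose leaves each contain exactly one center.

    Hypothetical sketch: `centers` is a list of equal-length tuples of floats.
    Each node is a dict; internal nodes carry "cut", "left" and "right".
    """
    if len(set(centers)) != len(centers):
        raise ValueError("centers must be distinct, otherwise no cut separates them")
    rng = random.Random(seed)
    d = len(centers[0])
    # Assumption: thresholds are drawn from the range spanned by all coordinates.
    lo = min(min(c) for c in centers)
    hi = max(max(c) for c in centers)
    root = {"centers": list(range(len(centers)))}
    pending = [root] if len(centers) > 1 else []  # leaves with more than one center
    while pending:
        i = rng.randrange(d)         # random coordinate
        theta = rng.uniform(lo, hi)  # random threshold
        still_pending = []
        for leaf in pending:
            left = [j for j in leaf["centers"] if centers[j][i] <= theta]
            right = [j for j in leaf["centers"] if centers[j][i] > theta]
            if left and right:
                # The cut separates at least two centers in this leaf: split it.
                leaf["cut"] = (i, theta)
                leaf["left"] = {"centers": left}
                leaf["right"] = {"centers": right}
                for child in (leaf["left"], leaf["right"]):
                    if len(child["centers"]) > 1:
                        still_pending.append(child)
            else:
                still_pending.append(leaf)  # cut does not split this leaf; try again later
        pending = still_pending
    return root

def assign(tree, x):
    """Follow the threshold cuts from the root and return the index of the center in x's leaf."""
    node = tree
    while "cut" in node:
        i, theta = node["cut"]
        node = node["left"] if x[i] <= theta else node["right"]
    return node["centers"][0]

def l1_cost(tree, centers, points):
    """Sum of l1 distances from each point to the center of the leaf it falls into."""
    total = 0.0
    for x in points:
        c = centers[assign(tree, x)]
        total += sum(abs(a - b) for a, b in zip(x, c))
    return total
```

For example, `tree = random_coordinate_cut([(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)])` followed by `assign(tree, (0.9, 0.1))` returns the index of the center whose leaf contains that point. Comparing `l1_cost` with the cost of assigning each point to its nearest center gives an empirical estimate of the ratio that the paper bounds in expectation.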
Peer Reviews
Decision: NeurIPS 2023 oral
- This paper resolves the question of whether the Random Coordinate Cut algorithm is optimal via a tight analysis.
- The analysis of the Set Elimination Game using Poisson processes is creative and novel (based on my understanding from the related work section, other papers implicitly analyze this game, but the method here is new). It is also quite clean.
- The formulation of the Set Elimination Game and the main result on its expected cost (Theorem 2.1) may be of independent interest.
The result is slightly weaker (by a factor of 2) than a concurrent result of Gupta, Pittu, Svensson, and Yuan (2023), which gives a bound of 1+H_{k-1}.
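To make the "factor of 2" remark concrete, the short Python sketch below evaluates both bounds, 2ln k + 2 and 1 + H_{k-1}, for a few values of k. The function names are hypothetical. Since H_{k-1} > ln k for k >= 2, the ratio of the two bounds stays below 2 and approaches 2 as k grows.

```python
import math

def bound_this_paper(k):
    """Upper bound 2 ln k + 2 from the abstract."""
    return 2 * math.log(k) + 2

def bound_concurrent(k):
    """Bound 1 + H_{k-1}, where H_n is the n-th harmonic number."""
    return 1 + sum(1 / j for j in range(1, k))

for k in (2, 10, 100, 1000):
    a, b = bound_this_paper(k), bound_concurrent(k)
    print(f"k={k:5d}  2ln k + 2 = {a:.3f}  1 + H_(k-1) = {b:.3f}  ratio = {a / b:.3f}")
```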
Taxonomy
Topics: Complexity and Algorithms in Graphs · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques
