Near-Optimal Explainable $k$-Means for All Dimensions
Moses Charikar, Lunjia Hu

TL;DR
This paper presents an efficient algorithm for explainable $k$-means clustering in all dimensions, achieving near-optimal cost bounds and significantly improving over previous methods, especially in two dimensions.
Contribution
The authors develop an efficient algorithm whose explainable $k$-means cost is within a $k^{1 - 2/d}\,\text{poly}(d \log k)$ factor of the best unconstrained clustering cost in every dimension $d \ge 2$, improving the known theoretical bounds.
Findings
Achieves a $k^{1 - 2/d}\,\text{poly}(d \log k)$ approximation ratio.
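Taking the minimum with the $k\,\text{polylog}(k)$ bound from independent work improves this to $k^{1 - 2/d}\,\text{polylog}(k)$, which is optimal up to a polylogarithmic factor in $k$.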
For $d=2$, obtains an $O(\log k \log\log k)$ bound, near-exponentially better than the previous best bound of $O(k \log k)$.
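To get a feel for how the polynomial part of the bound depends on the dimension, the short sketch below evaluates only the leading factor $k^{1 - 2/d}$ and ignores all logarithmic factors. The function name `leading_factor` is hypothetical and is not taken from the paper.

```python
def leading_factor(k: int, d: int) -> float:
    """Polynomial part k^(1 - 2/d) of the bound; log factors are ignored.

    Hypothetical helper for illustration only, not code from the paper.
    """
    if k < 2 or d < 2:
        raise ValueError("the bound is stated for k, d >= 2")
    return k ** (1.0 - 2.0 / d)


if __name__ == "__main__":
    k = 1024
    # d = 2 makes the exponent zero, so only log factors remain;
    # as d grows, the exponent approaches 1 and the factor approaches k.
    for d in (2, 3, 4, 10, 100):
        print(f"d={d:>3}  k^(1-2/d) = {leading_factor(k, d):.2f}")
```

The exponent is zero at $d=2$, which is consistent with the purely logarithmic bound stated for two dimensions, and it tends to 1 as $d$ grows.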
Abstract
Many clustering algorithms are guided by certain cost functions such as the widely-used $k$-means cost. These algorithms divide data points into clusters with often complicated boundaries, creating difficulties in explaining the clustering decision. In a recent work, Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020) introduced explainable clustering, where the cluster boundaries are axis-parallel hyperplanes and the clustering is obtained by applying a decision tree to the data. The central question here is: how much does the explainability constraint increase the value of the cost function? Given $d$-dimensional data points, we show an efficient algorithm that finds an explainable clustering whose $k$-means cost is at most $k^{1 - 2/d}\,\text{poly}(d \log k)$ times the minimum cost achievable by a clustering without the explainability constraint, assuming $k, d \ge 2$. Taking…
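The notion of an explainable clustering can be illustrated with a minimal sketch: a decision tree whose internal nodes each compare one coordinate to a threshold (an axis-parallel cut) and whose leaves are the clusters. The sketch below is not the paper's algorithm. The tree is built by hand, the toy data and the `reference_labels` clustering are assumptions, and the names `Node`, `assign`, and `kmeans_cost` are hypothetical. It only shows how the cost of a tree-induced clustering can be compared with the cost of an unconstrained one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

Point = Sequence[float]


@dataclass
class Node:
    # Internal node: go left if x[feature] <= threshold, otherwise right.
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None  # set only at leaves (the cluster id)


def assign(tree: Node, x: Point) -> int:
    """Follow axis-parallel cuts from the root down to a leaf."""
    node = tree
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def kmeans_cost(points, labels) -> float:
    """Sum of squared distances from each point to its cluster mean."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        mean = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        total += sum((p[j] - mean[j]) ** 2 for p in members for j in range(dim))
    return total


# Toy 2-D data (assumed): three well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0), (2.0, 5.0), (2.0, 6.0)]
# Assumed stand-in for a clustering found without the explainability constraint.
reference_labels = [0, 0, 1, 1, 2, 2]

# Hand-built tree with k = 3 leaves: first cut on y, then cut on x.
tree = Node(
    feature=1, threshold=3.0,
    left=Node(feature=0, threshold=2.0, left=Node(label=0), right=Node(label=1)),
    right=Node(label=2),
)

tree_labels = [assign(tree, p) for p in points]
ratio = kmeans_cost(points, tree_labels) / kmeans_cost(points, reference_labels)
print("tree labels:", tree_labels)
print("cost ratio (explainable / reference):", ratio)
```

On data this well separated, the tree reproduces the reference clustering exactly, so the ratio carries no penalty. The bounds in the abstract concern how large this ratio can become in the worst case when the clusters cannot be separated cleanly by axis-parallel cuts.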
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Stochastic Gradient Optimization Techniques · Bayesian Modeling and Causal Inference
