How to Find a Good Explanation for Clustering?
Sayan Bandyapadhyay, Fedor V. Fomin, Petr A. Golovach, William Lochet, Nidhi Purohit, Kirill Simonov

TL;DR
This paper explores algorithmic approaches to explainable clustering using decision trees, focusing on finding optimal explanations and minimizing clustering objectives, with a new outlier-based model and complexity analysis.
Contribution
Introduces a new outlier-based model for explainable clustering and analyzes the computational complexity of finding optimal explanations and clusterings.
Findings
New model inspired by outliers for explainable clustering
Algorithmic insights into complexity based on data parameters
Analysis of how parameters affect computational difficulty
Abstract
k-means and k-median clustering are powerful unsupervised machine learning techniques. However, due to complicated dependences on all the features, it is challenging to interpret the resulting cluster assignments. Moshkovitz, Dasgupta, Rashtchian, and Frost [ICML 2020] proposed an elegant model of explainable k-means and k-median clustering. In this model, a decision tree with k leaves provides a straightforward characterization of the data set into clusters. We study two natural algorithmic questions about explainable clustering. (1) For a given clustering, how to find the "best explanation" by using a decision tree with k leaves? (2) For a given set of points, how to find a decision tree with k leaves minimizing the k-means/median objective of the resulting explainable clustering? To address the first question, we introduce a new model of explainable clustering. Our…
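To make the model in the abstract concrete, the following is a minimal sketch of how a decision tree with k leaves can induce a clustering and how the k-means objective of that clustering can be evaluated. It assumes each internal node compares a single coordinate against a threshold; the names `Node`, `assign`, and `kmeans_cost`, as well as the toy data, are hypothetical and chosen only for illustration. It does not implement any algorithm from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

Point = Sequence[float]


@dataclass
class Node:
    # Internal node: send x left if x[feature] <= threshold, else right.
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Leaf node: `label` holds the cluster index (None for internal nodes).
    label: Optional[int] = None


def assign(tree: Node, x: Point) -> int:
    """Follow threshold cuts from the root down to a leaf; return its cluster."""
    node = tree
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def kmeans_cost(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Sum of squared distances from each point to the mean of its cluster."""
    clusters: dict = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cost = 0.0
    for members in clusters.values():
        dim = len(members[0])
        center = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        cost += sum((p[j] - center[j]) ** 2 for p in members for j in range(dim))
    return cost


# Toy data (hypothetical): five points in the plane, explained with k = 3 leaves.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0), (5.0, 10.0)]
tree = Node(
    feature=1, threshold=5.0,
    left=Node(feature=0, threshold=5.0, left=Node(label=0), right=Node(label=1)),
    right=Node(label=2),
)
labels = [assign(tree, p) for p in points]
cost = kmeans_cost(points, labels)
```

In this sketch every cluster is described by the at most two comparisons on the path to its leaf, which is the sense in which the tree serves as an explanation. Question (2) in the abstract corresponds to searching over such trees for the one with the smallest cost.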
Taxonomy
Topics: Advanced Clustering Algorithms Research · Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models
