How to Find a Good Explanation for Clustering?

Sayan Bandyapadhyay; Fedor V. Fomin; Petr A. Golovach; William Lochet; Nidhi Purohit; Kirill Simonov

arXiv:2112.06580 · cs.DS · December 17, 2021 · 1 citation

TL;DR

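This paper studies two algorithmic questions about explainable clustering with decision trees: how to find the best explanation of a given clustering by a decision tree with $k$ leaves, and how to find a decision tree with $k$ leaves that minimizes the $k$-means/median objective. It introduces a new outlier-based model for the first question and analyzes the computational complexity of both.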

Contribution

Introduces a new outlier-based model for explainable clustering and analyzes the computational complexity of finding optimal explanations and clusterings.

Findings

01. New model inspired by outliers for explainable clustering

02. Algorithmic insights into complexity based on data parameters

03. Analysis of how parameters affect computational difficulty

Abstract

$k$-means and $k$-median clustering are powerful unsupervised machine learning techniques. However, due to complicated dependences on all the features, it is challenging to interpret the resulting cluster assignments. Moshkovitz, Dasgupta, Rashtchian, and Frost [ICML 2020] proposed an elegant model of explainable $k$-means and $k$-median clustering. In this model, a decision tree with $k$ leaves provides a straightforward characterization of the data set into clusters. We study two natural algorithmic questions about explainable clustering. (1) For a given clustering, how to find the "best explanation" by using a decision tree with $k$ leaves? (2) For a given set of points, how to find a decision tree with $k$ leaves minimizing the $k$-means/median objective of the resulting explainable clustering? To address the first question, we introduce a new model of explainable clustering. Our…
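
To make the model in the abstract concrete, the following is a minimal sketch of how a decision tree with $k$ leaves induces a clustering and how the $k$-means objective of that clustering could be evaluated. It is not the paper's algorithm: it only scores one fixed tree. The names `Node`, `assign`, and `kmeans_cost`, the axis-aligned threshold splits, and the toy data are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

Point = Sequence[float]


@dataclass
class Node:
    """Hypothetical tree node: internal nodes split on one feature,
    leaves carry a cluster label (label is None for internal nodes)."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None


def assign(tree: Node, x: Point) -> int:
    """Walk from the root to a leaf; the leaf's label is the cluster of x."""
    node = tree
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def kmeans_cost(points: Sequence[Point], tree: Node) -> float:
    """Sum of squared distances from each point to the mean of its leaf."""
    clusters: dict = {}
    for p in points:
        clusters.setdefault(assign(tree, p), []).append(p)
    cost = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centre = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        cost += sum((p[j] - centre[j]) ** 2 for p in members for j in range(dim))
    return cost


# Toy data (assumed for illustration): five points in the plane, k = 3 leaves.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0), (10.0, 1.0)]
tree = Node(
    feature=0, threshold=2.0,
    left=Node(label=0),
    right=Node(feature=0, threshold=7.0, left=Node(label=1), right=Node(label=2)),
)

print([assign(tree, p) for p in points])
print(kmeans_cost(points, tree))
```

Question (2) in the abstract asks for the tree that minimizes this cost over all trees with $k$ leaves; the sketch only shows what is being minimized. For $k$-median, the squared distance to the mean would be replaced by a different per-cluster cost.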

Taxonomy

Topics: Advanced Clustering Algorithms Research · Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models