On the price of explainability for some clustering problems

Eduardo Laber; Lucas Murtinho

arXiv:2101.01576 · cs.LG · February 16, 2021 · 1 citation

TL;DR

This paper investigates the trade-off between explainability and optimality in clustering problems, providing bounds, algorithms, and empirical results for decision-tree based explainable clustering.

Contribution

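For $k$-means and $k$-medians, the paper improves on the upper bounds of Moshkovitz et al. (ICML 2020) in low dimensions. It also introduces a simple, efficient algorithm for explainable $k$-means that empirically outperforms the prior state of the art among decision-tree based methods.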

Findings

01 Upper and lower bounds on the price of explainability for $k$-means, $k$-medians, $k$-centers and maximum-spacing clustering

02 An efficient algorithm for explainable $k$-means clustering

03 Empirical evidence of improved performance over existing decision-tree based methods

Abstract

The price of explainability for a clustering task can be defined as the unavoidable loss, in terms of the objective function, if we force the final partition to be explainable. Here, we study this price for the following clustering problems: $k$-means, $k$-medians, $k$-centers and maximum-spacing. We provide upper and lower bounds for a natural model where explainability is achieved via decision trees. For the $k$-means and $k$-medians problems our upper bounds improve those obtained by [Moshkovitz et al., ICML 20] for low dimensions. Another contribution is a simple and efficient algorithm for building explainable clusterings for the $k$-means problem. We provide empirical evidence that its performance is better than the current state of the art for decision-tree based explainable clustering.
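To make the notion of "unavoidable loss" concrete, the following minimal sketch compares, on one tiny instance, the best unrestricted $k$-means cost with the best cost achievable when the partition must come from a single axis-aligned threshold (a depth-1 decision tree, $k = 2$). It is an illustration under stated assumptions, not the algorithm from the paper: the function names, the brute-force search, the example points, and the choice to report the loss as a cost ratio are all hypothetical choices made here.

```python
from itertools import product


def kmeans_cost(points, labels):
    """Sum of squared distances from each point to the mean of its cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cost = 0.0
    for members in clusters.values():
        dim = len(members[0])
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        cost += sum((p[d] - center[d]) ** 2 for p in members for d in range(dim))
    return cost


def best_unrestricted_cost(points, k):
    """Brute-force the optimal k-means cost over all labelings (tiny inputs only)."""
    return min(
        kmeans_cost(points, labels)
        for labels in product(range(k), repeat=len(points))
    )


def best_single_cut_cost(points):
    """Best 2-clustering induced by one axis-aligned threshold (depth-1 tree)."""
    best = float("inf")
    for d in range(len(points[0])):
        values = sorted({p[d] for p in points})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            labels = [0 if p[d] <= threshold else 1 for p in points]
            best = min(best, kmeans_cost(points, labels))
    return best


# Hypothetical instance: two parallel diagonal groups that overlap in both
# the x- and y-projections, so no single axis-aligned cut separates them.
points = [(0, 2), (1, 3), (2, 4), (2, 0), (3, 1), (4, 2)]

optimal = best_unrestricted_cost(points, k=2)
explainable = best_single_cut_cost(points)
ratio = explainable / optimal
print(f"unrestricted cost: {optimal:.3f}")
print(f"best single-cut cost: {explainable:.3f}")
print(f"ratio: {ratio:.3f}")
```

A ratio above 1 on an instance like this shows that restricting the partition to a threshold tree can cost something; the bounds studied in the paper concern how large such a loss can be in the worst case.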

Videos

On the price of explainability for some clustering problems · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Imbalanced Data Classification Techniques · Data Mining Algorithms and Applications