Shallow decision trees for explainable $k$-means clustering
Eduardo Laber, Lucas Murtinho, Felipe Oliveira

TL;DR
This paper introduces an efficient algorithm for constructing shallow, explainable decision trees for $k$-means clustering, improving interpretability while maintaining low clustering cost, and discusses computational hardness of optimal solutions.
Contribution
It proposes a new algorithm that considers tree depth for explainability in $k$-means clustering, outperforming previous methods in shallow tree construction.
Findings
Our algorithm achieves lower or comparable costs with shallower trees.
Decision-tree $k$-means clustering does not admit a polynomial-time $(1+\epsilon)$-approximation unless P=NP.
Shallow decision trees enhance explainability without sacrificing clustering quality.
Abstract
A number of recent works have employed decision trees for the construction of explainable partitions that aim to minimize the $k$-means cost function. These works, however, largely ignore metrics related to the depths of the leaves in the resulting tree, which is perhaps surprising considering how the explainability of a decision tree depends on these depths. To fill this gap in the literature, we propose an efficient algorithm that takes into account these metrics. In experiments on 16 datasets, our algorithm yields better results than decision-tree clustering algorithms such as the ones presented in \cite{dasgupta2020explainable}, \cite{frost2020exkmc}, \cite{laber2021price} and \cite{DBLP:conf/icml/MakarychevS21}, typically achieving lower or equivalent costs with considerably shallower trees. We also show, through a simple adaptation of existing techniques, that the problem of…
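To make the two quantities in the abstract concrete, the following minimal sketch evaluates a hand-built threshold tree on toy data: it computes the $k$-means cost of the partition the tree induces and the average depth of the leaves that points fall into. It is not the authors' algorithm, which constructs the tree; this only scores a tree that is already given. The names `Node`, `assign`, and `kmeans_cost`, the toy points, and the choice of point-weighted average leaf depth as the depth metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """Hypothetical threshold-tree node; a leaf has `label` set."""
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None


def assign(root: Node, x: Sequence[float]):
    """Follow axis-aligned cuts from the root; return (cluster label, leaf depth)."""
    node, depth = root, 0
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
        depth += 1
    return node.label, depth


def kmeans_cost(points, labels):
    """Sum of squared distances from each point to its cluster's centroid."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cost = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        cost += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        )
    return cost


# Toy data (assumed): three visually separated groups along the first axis.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0), (10.0, 1.0)]

# A tree with three leaves: one at depth 1, two at depth 2.
tree = Node(
    feature=0, threshold=2.0,
    left=Node(label=0),
    right=Node(feature=0, threshold=7.0, left=Node(label=1), right=Node(label=2)),
)

results = [assign(tree, p) for p in points]
labels = [lab for lab, _ in results]
depths = [d for _, d in results]

cost = kmeans_cost(points, labels)
avg_depth = sum(depths) / len(depths)
print(f"k-means cost = {cost}, average leaf depth = {avg_depth}")
```

Each cluster here is described by at most two threshold tests, which is the sense in which a shallower tree is easier to explain; a deeper tree might lower the cost but lengthen every explanation.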
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Advanced Clustering Algorithms Research · Automated Road and Building Extraction
