Balancing the Tradeoff Between Clustering Value and Interpretability

Sandhya Saisubramanian; Sainyam Galhotra; Shlomo Zilberstein

arXiv:1912.07820·stat.ML·February 3, 2020

TL;DR

This paper introduces a tunable clustering algorithm that balances cluster quality against interpretability by ensuring that at least a user-specified fraction of nodes in each cluster share the same feature value. The algorithm is supported by theoretical analysis and empirical validation.

Contribution

It proposes a $β$-interpretable clustering algorithm that guarantees the interpretability constraint is met, a more efficient algorithm for the case $β = 1$, and theoretical guarantees for both algorithms.

Findings

01

The algorithms produce more interpretable clusters in real-world datasets.

02

The $β$ parameter effectively controls the interpretability level.

03

The approach includes generating simple explanations for the clusters.

Abstract

Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated decision-support systems hinges on the interpretability of the resulting clusters. This paper addresses the problem of generating interpretable clusters, given features of interest that signify interpretability to an end-user, by optimizing interpretability in addition to common clustering objectives. We propose a $β$-interpretable clustering algorithm that ensures that at least a $β$ fraction of nodes in each cluster share the same feature value. The tunable parameter $β$ is user-specified. We also present a more efficient algorithm for scenarios with $β = 1$ and analyze the theoretical guarantees of the two algorithms. Finally, we empirically…
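
The $β$-interpretability condition stated in the abstract can be checked directly for any given clustering. The Python sketch below is an illustration only, not the authors' implementation: the function names `interpretability_score` and `is_beta_interpretable` and the toy data are assumptions introduced here, and the sketch considers a single feature of interest per node.

```python
from collections import Counter


def interpretability_score(feature_values):
    """Fraction of nodes in one cluster that share its most common feature value.

    `feature_values` holds the value of the feature of interest for each node
    in the cluster (hypothetical representation chosen for this sketch).
    """
    if not feature_values:
        raise ValueError("cluster must be non-empty")
    _, count = Counter(feature_values).most_common(1)[0]
    return count / len(feature_values)


def is_beta_interpretable(clusters, beta):
    """True if every cluster has at least a `beta` fraction of nodes
    sharing the same feature value."""
    return all(interpretability_score(c) >= beta for c in clusters)


# Toy clustering: each inner list is one cluster's feature values.
clusters = [
    ["red", "red", "red", "blue"],   # 3 of 4 nodes share "red"
    ["blue", "blue", "green"],       # 2 of 3 nodes share "blue"
]

print(is_beta_interpretable(clusters, beta=0.6))
print(is_beta_interpretable(clusters, beta=0.7))
```

With this toy data the first check passes and the second fails, because the second cluster has only two of three nodes sharing a value. Setting `beta=1` would require every node in a cluster to share the same feature value, which is the special case the abstract says admits a more efficient algorithm.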

Code & Models

Repositories

sandysa/Interpretable_Clustering (official)

Taxonomy

Methods: Interpretability