Interpretable Deep Clustering for Tabular Data

Jonathan Svirsky; Ofir Lindenbaum

arXiv:2306.04785 · cs.LG · June 11, 2024 · 1 citation

TL;DR

This paper introduces a deep learning framework for interpretable clustering of tabular data. It identifies the most informative features for each sample and each cluster, and reports reliable, interpretable results on biological, text, image, and physics datasets.

Contribution

The paper presents a self-supervised feature selection method and a model that predicts interpretable cluster assignments, with feature importance reported at both the sample and cluster levels.

Findings

01

Reliable clustering in biological, text, image, and physics datasets.

02

Model provides interpretable feature importance for samples and clusters.

03

Code available for reproducibility.

Abstract

Clustering is a fundamental learning task widely used as a first step in data analysis. For example, biologists use cluster assignments to analyze genome sequences, medical records, or images. Since downstream analysis is typically performed at the cluster level, practitioners seek reliable and interpretable clustering models. We propose a new deep-learning framework for general domain tabular data that predicts interpretable cluster assignments at the instance and cluster levels. First, we present a self-supervised procedure to identify the subset of the most informative features from each data point. Then, we design a model that predicts cluster assignments and a gate matrix that provides cluster-level feature selection. Overall, our model provides cluster assignments with an indication of the driving feature for each sample and each cluster. We show that the proposed method can…
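The two levels of interpretation described in the abstract can be illustrated with a minimal sketch. It assumes the sample-level gates, the cluster probabilities, and the cluster-level gate matrix have already been learned, and only shows how they might be read out. All names here (`gate_inputs`, `top_features`, `explain`) and the toy values are hypothetical; this is not the code from the jsvir/idc repository.

```python
import numpy as np

def gate_inputs(x, local_gates):
    """Element-wise mask: features whose gate is near 0 are suppressed."""
    return x * local_gates

def top_features(gates, k):
    """Indices of the k largest gate values in each row, highest first."""
    order = np.argsort(-gates, axis=1, kind="stable")
    return order[:, :k]

def explain(x, local_gates, cluster_probs, cluster_gates, k=2):
    """Collect hard assignments plus sample-level and cluster-level features."""
    return {
        "gated_x": gate_inputs(x, local_gates),
        "assignments": cluster_probs.argmax(axis=1),
        "sample_features": top_features(local_gates, k),
        "cluster_features": top_features(cluster_gates, k),
    }

# Toy values (assumed, not taken from the paper): 2 samples, 4 features, 2 clusters.
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0, 8.0]])
local_gates = np.array([[0.9, 0.0, 0.1, 0.8],     # one gate vector per sample
                        [0.0, 0.7, 0.95, 0.0]])
cluster_probs = np.array([[0.8, 0.2],             # soft cluster assignments
                          [0.3, 0.7]])
cluster_gates = np.array([[1.0, 0.0, 0.0, 1.0],   # one gate row per cluster
                          [0.0, 1.0, 1.0, 0.0]])

result = explain(x, local_gates, cluster_probs, cluster_gates, k=2)
print(result["assignments"])       # cluster index per sample
print(result["sample_features"])   # driving features per sample
print(result["cluster_features"])  # driving features per cluster
```

In this sketch the sample-level gates answer "which features mattered for this data point", while each row of the cluster-level gate matrix answers "which features characterize this cluster". How the gates are actually trained is not covered here.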

Code & Models

Repositories

jsvir/idc
PyTorch · Official

Taxonomy

Topics: Advanced Clustering Algorithms Research · Biomedical Text Mining and Ontologies · Explainable Artificial Intelligence (XAI)