Interpreting Classifiers through Attribute Interactions in Datasets

Andreas Henelius; Kai Puolamäki; Antti Ukkonen

arXiv:1707.07576 · stat.ML · July 25, 2017 · 31 cites

Open Access

TL;DR

This paper introduces ASTRID, a method for identifying which attribute interactions in a dataset a classifier exploits when making predictions. Knowing these interactions makes the model more interpretable by revealing associations between attributes, with applications such as pharmacovigilance and bioinformatics.

Contribution

The paper presents ASTRID, an approach for identifying the attribute interactions a classifier uses, and relates the resulting attribute partitioning to a factorisation of the data-generating distribution.

Findings

01. ASTRID effectively uncovers attribute interactions used by classifiers.

02. The method reveals meaningful attribute associations in real-world datasets.

03. Empirical results demonstrate the utility of ASTRID in interpretability tasks.

Abstract

In this work we present the novel ASTRID method for investigating which attribute interactions classifiers exploit when making predictions. Attribute interactions in classification tasks mean that two or more attributes together provide stronger evidence for a particular class label. Knowledge of such interactions makes models more interpretable by revealing associations between attributes. This has applications, e.g., in pharmacovigilance to identify interactions between drugs or in bioinformatics to investigate associations between single nucleotide polymorphisms. We also show how the found attribute partitioning is related to a factorisation of the data generating distribution and empirically demonstrate the utility of the proposed method.
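
To make the notion of an attribute interaction concrete, the following is a minimal toy sketch, not the ASTRID algorithm itself. It assumes a synthetic dataset in which the class label is the XOR of two binary attributes, so neither attribute alone is informative but the pair determines the class. All function names (`make_xor_data`, `lookup_accuracy`, `shuffle_within_class`) are hypothetical, and permuting one attribute within each class is used here only as one illustrative way to break an interaction while keeping per-class marginals intact.

```python
import random
from collections import Counter, defaultdict


def make_xor_data(n, seed=0):
    """Toy data (assumption): two binary attributes, label is their XOR."""
    rng = random.Random(seed)
    rows = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n)]
    labels = [a ^ b for a, b in rows]
    return rows, labels


def lookup_accuracy(rows, labels, attrs):
    """Training accuracy of a majority-vote lookup table that sees only `attrs`."""
    votes = defaultdict(Counter)
    for row, y in zip(rows, labels):
        votes[tuple(row[i] for i in attrs)][y] += 1
    # In each cell the table predicts the most common label seen there.
    correct = sum(c.most_common(1)[0][1] for c in votes.values())
    return correct / len(rows)


def shuffle_within_class(rows, labels, attr, seed=0):
    """Permute one attribute's values among rows sharing a class label.

    This keeps each attribute's per-class distribution unchanged but
    destroys its joint relation with the other attributes.
    """
    rng = random.Random(seed)
    out = [list(r) for r in rows]
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    for idxs in by_class.values():
        vals = [rows[i][attr] for i in idxs]
        rng.shuffle(vals)
        for i, v in zip(idxs, vals):
            out[i][attr] = v
    return [tuple(r) for r in out]


rows, labels = make_xor_data(2000)

acc_a = lookup_accuracy(rows, labels, attrs=(0,))        # attribute 0 alone
acc_b = lookup_accuracy(rows, labels, attrs=(1,))        # attribute 1 alone
acc_joint = lookup_accuracy(rows, labels, attrs=(0, 1))  # both together

# Break the interaction by shuffling attribute 0 within each class.
broken = shuffle_within_class(rows, labels, attr=0)
acc_broken = lookup_accuracy(broken, labels, attrs=(0, 1))

print("attribute 0 alone:", acc_a)
print("attribute 1 alone:", acc_b)
print("both attributes:", acc_joint)
print("both, interaction broken:", acc_broken)
```

In this toy setting the two attributes are only useful jointly, and the lookup table loses its advantage once the joint structure is shuffled away. A grouping that kept both attributes together would preserve that structure, which is the kind of distinction an attribute partitioning is meant to capture.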

Taxonomy

Topics: Imbalanced Data Classification Techniques · Computational Drug Discovery Methods · Pharmacovigilance and Adverse Drug Reactions