Interpreting Classifiers through Attribute Interactions in Datasets
Andreas Henelius, Kai Puolamäki, Antti Ukkonen

TL;DR
This paper introduces ASTRID, a method for identifying which attribute interactions a classifier exploits when making predictions. Knowing these interactions makes models more interpretable and reveals associations between attributes, with applications such as pharmacovigilance and bioinformatics.
Contribution
The paper presents ASTRID, an approach for identifying the attribute interactions a classifier uses, and relates the resulting attribute partitioning to a factorisation of the data generating distribution.
Findings
ASTRID effectively uncovers attribute interactions used by classifiers.
The method reveals meaningful attribute associations in real-world datasets.
Empirical results demonstrate the utility of ASTRID in interpretability tasks.
Abstract
In this work we present the novel ASTRID method for investigating which attribute interactions classifiers exploit when making predictions. Attribute interactions in classification tasks mean that two or more attributes together provide stronger evidence for a particular class label. Knowledge of such interactions makes models more interpretable by revealing associations between attributes. This has applications, e.g., in pharmacovigilance to identify interactions between drugs or in bioinformatics to investigate associations between single nucleotide polymorphisms. We also show how the found attribute partitioning is related to a factorisation of the data generating distribution and empirically demonstrate the utility of the proposed method.
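To make the idea of an exploited attribute interaction concrete, here is a minimal toy sketch. It assumes one way of probing a classifier: shuffle groups of attributes within each class and check whether accuracy survives. The abstract does not spell out ASTRID's procedure, so this should not be read as the authors' implementation; the names `permute_within_class`, `accuracy`, and `xor_classifier` are hypothetical.

```python
import numpy as np

def permute_within_class(X, y, grouping, rng):
    """Shuffle each attribute group independently among rows sharing a class label.

    Attributes in the same group move together, so their joint values are kept;
    attributes in different groups are decoupled within each class.
    """
    X_perm = X.copy()
    for label in np.unique(y):
        rows = np.flatnonzero(y == label)
        for group in grouping:
            shuffled = rng.permutation(rows)
            X_perm[np.ix_(rows, group)] = X[np.ix_(shuffled, group)]
    return X_perm

def accuracy(predict, X, y):
    """Fraction of rows for which the prediction equals the label."""
    return float(np.mean(predict(X) == y))

# Toy data (an assumption for illustration): the class is the XOR of
# attributes 0 and 1, and attribute 2 is pure noise.
rng = np.random.default_rng(0)
n = 2000
X = rng.integers(0, 2, size=(n, 3))
y = X[:, 0] ^ X[:, 1]

def xor_classifier(X):
    """A fixed classifier that relies on the interaction of attributes 0 and 1."""
    return X[:, 0] ^ X[:, 1]

# Grouping that keeps the interacting attributes together.
acc_joint = accuracy(xor_classifier, permute_within_class(X, y, [[0, 1], [2]], rng), y)
# Grouping that treats every attribute as independent given the class.
acc_split = accuracy(xor_classifier, permute_within_class(X, y, [[0], [1], [2]], rng), y)

print("attributes 0 and 1 grouped:", acc_joint)
print("all attributes separate:  ", acc_split)
```

In this toy setting `acc_joint` stays at 1.0 because the joint values of attributes 0 and 1 are preserved, while `acc_split` falls towards chance level. A grouping that leaves accuracy intact corresponds to treating the groups as independent given the class, which is the kind of factorisation the abstract refers to.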
Taxonomy
Topics: Imbalanced Data Classification Techniques · Computational Drug Discovery Methods · Pharmacovigilance and Adverse Drug Reactions
