ELUDE: Generating interpretable explanations via a decomposition into labelled and unlabelled features

Vikram V. Ramaswamy; Sunnie S. Y. Kim; Nicole Meister; Ruth Fong; Olga Russakovsky

arXiv:2206.07690 · cs.CV · June 20, 2022 · 5 cites

Open Access

TL;DR

ELUDE is a novel explanation framework that decomposes neural network predictions into interpretable semantic attributes and a small set of uninterpretable features, providing deeper insights into model behavior.

Contribution

The paper introduces ELUDE, a new method that combines labeled and unlabeled features to explain complex models, extending beyond purely semantic attribute explanations.

Findings

01. ELUDE effectively decomposes predictions into explainable and unexplained parts.

02. Unlabeled features generalize across models trained on the same data.

03. ELUDE offers additional insights compared to existing attribute-based explanation methods.

Abstract

Deep learning models have achieved remarkable success in different areas of machine learning over the past decade; however, the size and complexity of these models make them difficult to understand. In an effort to make them more interpretable, several recent works focus on explaining parts of a deep neural network through human-interpretable, semantic attributes. However, it may be impossible to completely explain complex models using only semantic attributes. In this work, we propose to augment these attributes with a small set of uninterpretable features. Specifically, we develop a novel explanation framework ELUDE (Explanation via Labelled and Unlabelled DEcomposition) that decomposes a model's prediction into two parts: one that is explainable through a linear combination of the semantic attributes, and another that is dependent on the set of uninterpretable features. By…
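The two-part decomposition described in the abstract can be illustrated with a small numerical sketch. This is not the authors' training procedure: ordinary least squares stands in for the linear attribute model, and a reduced-rank regression on the leftover stands in for the small set of uninterpretable features. The function `decompose`, the arrays `attrs`, `feats`, and `logits`, the rank `k`, and the synthetic data are all hypothetical names and values chosen for illustration.

```python
import numpy as np

def decompose(attrs, feats, logits, k=2):
    """Split logits into a linear attribute part and a rank-k unlabelled part.

    Illustrative only: least squares and reduced-rank regression are
    assumptions made for this sketch, not the method from the paper.
    """
    # Labelled part: linear combination of semantic attributes.
    W, *_ = np.linalg.lstsq(attrs, logits, rcond=None)
    labelled = attrs @ W
    residual = logits - labelled

    # Unlabelled part: regress the residual on the model's features,
    # then keep only the top-k directions of the fitted values.
    B, *_ = np.linalg.lstsq(feats, residual, rcond=None)
    _, _, Vt = np.linalg.svd(feats @ B, full_matrices=False)
    Vk = Vt[:k]              # (k, n_classes)
    P = B @ Vk.T             # (feature_dim, k): projection to k unlabelled features
    low_rank = feats @ P     # (n_samples, k): the unlabelled features
    unlabelled = low_rank @ Vk
    return labelled, unlabelled, low_rank

# Synthetic stand-in data (assumed shapes, not from the paper).
rng = np.random.default_rng(0)
n, n_attr, d, c = 200, 5, 16, 3
attrs = rng.integers(0, 2, size=(n, n_attr)).astype(float)  # binary attribute labels
feats = rng.normal(size=(n, d))                             # penultimate-layer features
logits = (attrs @ rng.normal(size=(n_attr, c))
          + feats @ rng.normal(size=(d, 2)) @ rng.normal(size=(2, c)))

labelled, unlabelled, low_rank = decompose(attrs, feats, logits, k=2)
err_attr_only = np.linalg.norm(logits - labelled)
err_both = np.linalg.norm(logits - labelled - unlabelled)
print(err_attr_only, err_both)
```

In this sketch the attribute part is fitted first, so the unlabelled features only account for what the attributes leave unexplained; the two printed norms compare how much of the output remains unexplained with and without the unlabelled part.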

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning