Explaining Classifiers by Constructing Familiar Concepts

Johannes Schneider; Michail Vlachos

arXiv:2203.04109 · cs.CV · March 9, 2022

TL;DR

This paper introduces ClaDec, a neural network architecture for interpreting deep learning models. It decodes the activations of a classifier layer back into the input domain, which humans are familiar with, so that they can see which features the classifier retains for classification.

Contribution

ClaDec is a novel architecture that transforms neuron representations into human-understandable concepts, improving interpretability of deep neural networks.

Findings

1. Images reconstructed with ClaDec capture more classification-relevant concepts than those reconstructed by a conventional auto-encoder.

2. Even non-experts can identify key concepts in images using ClaDec visualizations.

3. ClaDec highlights more relevant input areas than saliency-based methods.

Abstract

Interpreting a large number of neurons in deep learning is difficult. Our proposed 'CLAssifier-DECoder' architecture (ClaDec) facilitates the understanding of the output of an arbitrary layer of neurons or subsets thereof. It uses a decoder that transforms the incomprehensible representation of the given neurons into a representation that is more similar to the domain a human is familiar with. In an image recognition problem, one can recognize what information (or concepts) a layer maintains by contrasting the reconstructed images of ClaDec with those of a conventional auto-encoder (AE) serving as reference. An extension of ClaDec allows trading off comprehensibility against fidelity. We evaluate our approach for image classification using convolutional neural networks. We show that reconstructed visualizations using encodings from a classifier capture more relevant classification information than…
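The setup the abstract describes can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the tiny CNN, the decoder layout, the layer chosen for explanation, the random placeholder images, the MSE reconstruction loss, and the assumption that the classifier is already trained and kept frozen are all hypothetical choices made for illustration. The sketch trains one decoder on the classifier's layer activations (the ClaDec path) and, as a reference, an auto-encoder of the same shape whose encoder is trained purely for reconstruction. The extension that trades comprehensibility against fidelity is not shown.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def make_encoder() -> nn.Sequential:
    """Hypothetical conv stack: 1x28x28 image -> 16x7x7 activations."""
    return nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
        nn.ReLU(),
        nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),  # 14x14 -> 7x7
        nn.ReLU(),
    )


def make_decoder() -> nn.Sequential:
    """Hypothetical decoder: 16x7x7 activations -> 1x28x28 image."""
    return nn.Sequential(
        nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),  # 7x7 -> 14x14
        nn.ReLU(),
        nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),   # 14x14 -> 28x28
        nn.Sigmoid(),
    )


class TinyClassifier(nn.Module):
    """Stand-in for the classifier to be explained (assumed already trained)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = make_encoder()  # the layer output we want to explain
        self.head = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))


# Classifier is frozen: only the decoder learns in the ClaDec path (assumption).
classifier = TinyClassifier().eval()
for p in classifier.parameters():
    p.requires_grad_(False)
cladec_decoder = make_decoder()

# Reference auto-encoder: same shapes, but the encoder is trained for reconstruction.
ref_encoder, ref_decoder = make_encoder(), make_decoder()

images = torch.rand(32, 1, 28, 28)  # random placeholder batch, not real data
mse = nn.MSELoss()
opt_cladec = torch.optim.Adam(cladec_decoder.parameters(), lr=1e-3)
opt_ref = torch.optim.Adam(
    list(ref_encoder.parameters()) + list(ref_decoder.parameters()), lr=1e-3
)


def train_step() -> tuple[float, float]:
    """One reconstruction step for the ClaDec decoder and the reference AE."""
    with torch.no_grad():
        activations = classifier.features(images)
    loss_cladec = mse(cladec_decoder(activations), images)
    opt_cladec.zero_grad()
    loss_cladec.backward()
    opt_cladec.step()

    loss_ref = mse(ref_decoder(ref_encoder(images)), images)
    opt_ref.zero_grad()
    loss_ref.backward()
    opt_ref.step()
    return loss_cladec.item(), loss_ref.item()


losses = [train_step() for _ in range(5)]

# The two reconstructions are what a human would compare side by side.
with torch.no_grad():
    recon_cladec = cladec_decoder(classifier.features(images))
    recon_ref = ref_decoder(ref_encoder(images))
```

With real images and a trained classifier, differences between `recon_cladec` and `recon_ref` would indicate information the chosen layer has discarded or kept relative to an encoder optimized only for reconstruction.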

Code & Models

Repositories

JohnTailor/ClaDec
PyTorch · Official

Taxonomy

Topics: Cell Image Analysis Techniques · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science