Explaining Deep Learning Representations by Tracing the Training Process

Lukas Pfahler; Katharina Morik

arXiv:2109.05880 · cs.LG · September 14, 2021 · 1 citation

TL;DR

This paper introduces a new explanation method for deep neural networks that traces how intermediate representations evolve during training, identifying influential training examples and class contributions.

Contribution

The method is general: it can be wrapped around any iterative optimization procedure and covers feed-forward and convolutional networks. It supports both analysis of the training dynamics and explanations of individual decisions.

Findings

1. Identifies influential training examples for model decisions

2. Provides visualization of training process and class contributions

3. Works with both single-instance and mini-batch training

Abstract

We propose a novel explanation method that explains the decisions of a deep neural network by investigating how the intermediate representations at each layer of the network were refined during the training process. This way we can a) find the most influential training examples during training and b) analyze which classes contributed most to the final representation. Our method is general: it can be wrapped around any iterative optimization procedure and covers a variety of neural network architectures, including feed-forward networks and convolutional neural networks. We first propose a method for stochastic training with single training instances, and then derive a variant for the common mini-batch training. In experimental evaluations, we show that our method identifies highly representative training instances that can be used as an explanation. Additionally, we…
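The core idea in the abstract, attributing a final representation to individual training steps, can be illustrated on the simplest possible case. The following is a minimal sketch, not the authors' implementation: it assumes a single linear softmax layer trained with plain single-instance SGD, where the weight update for an example is an outer product and the final logits for a query therefore decompose exactly into the initial logits plus one term per training step. The function names `train_and_trace` and `explain`, the toy data, and the hyperparameters are all hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def train_and_trace(data, n_classes, n_features, lr=0.1, epochs=5, seed=0):
    """Single-instance SGD on a linear softmax classifier, logging every step."""
    rng = random.Random(seed)
    W0 = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
          for _ in range(n_classes)]
    W = [row[:] for row in W0]
    trace = []  # one (example index, label, input, output gradient) per step
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            p = softmax(matvec(W, x))
            # Cross-entropy gradient w.r.t. the logits.
            delta = [p[k] - (1.0 if k == y else 0.0) for k in range(n_classes)]
            # The gradient w.r.t. W is outer(delta, x); take a plain SGD step.
            for k in range(n_classes):
                for j in range(n_features):
                    W[k][j] -= lr * delta[k] * x[j]
            trace.append((i, y, x, delta))
    return W0, W, trace

def explain(W0, trace, x_query, lr, n_classes):
    """Split the final logits for x_query into per-class and per-example parts."""
    base = matvec(W0, x_query)
    by_class = [[0.0] * n_classes for _ in range(n_classes)]
    by_example = {}
    for i, y, x, delta in trace:
        # Step contribution: (-lr * outer(delta, x)) @ x_query.
        sim = dot(x, x_query)
        contrib = [-lr * d * sim for d in delta]
        by_class[y] = [a + c for a, c in zip(by_class[y], contrib)]
        prev = by_example.get(i, [0.0] * n_classes)
        by_example[i] = [a + c for a, c in zip(prev, contrib)]
    return base, by_class, by_example

# Toy data (hypothetical): two features, two classes.
data = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.1, 0.9], 1)]
W0, W, trace = train_and_trace(data, n_classes=2, n_features=2, lr=0.1, epochs=5)

x_query = [0.8, 0.2]
base, by_class, by_example = explain(W0, trace, x_query, lr=0.1, n_classes=2)

# Initial logits plus all traced contributions recover the final logits.
reconstructed = [base[k] + sum(by_class[c][k] for c in range(2)) for k in range(2)]
final_logits = matvec(W, x_query)
```

Here `by_example` ranks training examples by how much they moved each logit of the query, and `by_class` groups the same contributions by training label. For deeper networks, optimizers with momentum, or mini-batches, the bookkeeping is more involved; this sketch only shows why such a decomposition is possible in the linear, single-instance case.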

Code & Models

Repositories

whadup/xai_tracking
PyTorch · Official

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification