Explaining, Evaluating and Enhancing Neural Networks' Learned Representations

Marco Bertolini, Djork-Arné Clevert, Floriane Montanari

arXiv:2202.09374·cs.LG·February 22, 2022

Open Access

TL;DR

This paper introduces an explainability framework for neural networks trained without a specific downstream task. It proposes a method for aggregating attribution maps and two scores for evaluating learned representations, and reports that using the scores as training constraints improves downstream performance.

Contribution

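The paper presents an aggregation method that generalizes attribution maps between any two (convolutional) layers of a neural network, and uses these attributions to define two scores: one for the informativeness and one for the disentanglement of latent embeddings.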
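
To make the idea of aggregating attribution maps over latent units more concrete, here is a minimal sketch in plain Python. It is not the authors' method: the toy `tanh` encoder, the gradient-times-input attribution, and the sum-of-absolute-values aggregation are all assumptions chosen for illustration, and the names `encoder`, `attribution_maps`, and `aggregate` are hypothetical.

```python
import math

def encoder(W, x):
    """Toy encoder (assumption): z_k = tanh(sum_i W[k][i] * x[i])."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def attribution_maps(W, x):
    """One gradient-times-input map per latent unit.

    For the toy encoder, d z_k / d x_i = (1 - z_k**2) * W[k][i],
    so the attribution of input i to latent k is that gradient times x[i].
    """
    z = encoder(W, x)
    return [
        [(1.0 - zk * zk) * w * xi for w, xi in zip(row, x)]
        for zk, row in zip(z, W)
    ]

def aggregate(maps):
    """Illustrative aggregation (assumption): sum absolute attributions
    over latent units, then normalise so the result sums to 1."""
    totals = [sum(abs(m[i]) for m in maps) for i in range(len(maps[0]))]
    s = sum(totals)
    return [t / s for t in totals] if s > 0 else totals

# Two latent units, three input features; the third feature is ignored
# by both units, so it should receive zero aggregated attribution.
W = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0]]
x = [0.5, 0.5, 1.0]

maps = attribution_maps(W, x)
saliency = aggregate(maps)
```

With no downstream task available, a single aggregated map over all latent units is one way to ask which input features the representation as a whole depends on; the paper's own aggregation and scores may differ from this sketch.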

Findings

01. Scores correlate with desired properties of representations.

02. Adopting scores as constraints improves downstream task performance.

03. Saliency strategies can be independent of model parameters.

Abstract

Most efforts in interpretability in deep learning have focused on (1) extracting explanations of a specific downstream task in relation to the input features and (2) imposing constraints on the model, often at the expense of predictive performance. New advances in (unsupervised) representation learning and transfer learning, however, raise the need for an explanatory framework for networks that are trained without a specific downstream task. We address these challenges by showing how explainability can be an aid, rather than an obstacle, towards better and more efficient representations. Specifically, we propose a natural aggregation method generalizing attribution maps between any two (convolutional) layers of a neural network. Additionally, we employ such attributions to define two novel scores for evaluating the informativeness and the disentanglement of latent embeddings. Extensive…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Adversarial Robustness in Machine Learning