Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs

Robin Rombach; Patrick Esser; Björn Ommer

arXiv:2008.01777·cs.CV·August 6, 2020

TL;DR

This paper uses invertible neural networks (INNs) to interpret CNN representations post hoc. It uncovers the semantic concepts a model has learned and the invariances it has acquired, so that the model can be understood and modified without degrading its performance.

Contribution

The paper presents a novel invertible approach that disentangles and visualizes the invariances and semantic concepts a CNN has learned, making its representations easier to interpret.

Findings

01 Enables semantic understanding of CNN representations
02 Allows modification of learned invariances
03 Maintains model performance after interpretation

Abstract

To tackle increasingly complex tasks, it has become an essential ability of neural networks to learn abstract representations. These task-specific representations and, particularly, the invariances they capture turn neural networks into black box models that lack interpretability. To open such a black box, it is, therefore, crucial to uncover the different semantic concepts a model has learned as well as those that it has learned to be invariant to. We present an approach based on INNs that (i) recovers the task-specific, learned invariances by disentangling the remaining factor of variation in the data and that (ii) invertibly transforms these recovered invariances combined with the model representation into an equally expressive one with accessible semantic concepts. As a consequence, neural network representations become understandable by providing the means to (i) expose their…
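
The abstract relies on the transformation being invertible, so that the translated representation is "equally expressive" and nothing is lost. A minimal sketch of why an invertible block loses no information is given below, using an affine coupling layer, a common INN building block. This is an illustration under stated assumptions, not the authors' architecture: the class name `AffineCoupling`, the fixed random weights standing in for learned sub-networks, and the toy vector `z` are all hypothetical.

```python
import math
import random


class AffineCoupling:
    """Toy affine coupling block (hypothetical, for illustration only).

    The input is split into halves (a, b). Half `a` passes through
    unchanged and parameterises a scale and shift applied to `b`.
    Because `a` is available unchanged on the output side, the scale
    and shift can be recomputed and undone exactly.
    """

    def __init__(self, dim, seed=0):
        if dim % 2 != 0:
            raise ValueError("dim must be even")
        rng = random.Random(seed)
        self.half = dim // 2
        # Assumption: fixed random linear maps stand in for learned sub-networks.
        self.w_s = [[rng.uniform(-0.5, 0.5) for _ in range(self.half)]
                    for _ in range(self.half)]
        self.w_t = [[rng.uniform(-0.5, 0.5) for _ in range(self.half)]
                    for _ in range(self.half)]

    def _scale_shift(self, a):
        # tanh keeps the log-scale bounded so exp() stays well behaved.
        s = [math.tanh(sum(w * x for w, x in zip(row, a))) for row in self.w_s]
        t = [sum(w * x for w, x in zip(row, a)) for row in self.w_t]
        return s, t

    def forward(self, z):
        a, b = list(z[:self.half]), list(z[self.half:])
        s, t = self._scale_shift(a)
        return a + [bi * math.exp(si) + ti for bi, si, ti in zip(b, s, t)]

    def inverse(self, y):
        a, yb = list(y[:self.half]), list(y[self.half:])
        s, t = self._scale_shift(a)
        return a + [(yi - ti) * math.exp(-si) for yi, si, ti in zip(yb, s, t)]


layer = AffineCoupling(dim=4)
z = [0.3, -1.2, 0.7, 2.0]      # toy stand-in for a representation vector
y = layer.forward(z)           # transformed representation
z_rec = layer.inverse(y)       # recovered from y alone
max_err = max(abs(p - q) for p, q in zip(z, z_rec))
print("max reconstruction error:", max_err)
```

The round trip recovers `z` up to floating-point error, which is the property that lets an invertible mapping re-express a representation without discarding any of its content. Practical INNs stack many such blocks and alternate which half is transformed.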

Code & Models

Repositories

CompVis/invariances
pytorch

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI)