Learning Invariances for Interpretability using Supervised VAE

An-phi Nguyen; María Rodríguez Martínez

arXiv:2007.07591 · cs.LG · July 16, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a supervised variational auto-encoder (VAE) that learns a model's invariances, the transformations that leave its classification unchanged, as a way to interpret how a complex model solves a problem.

Contribution

It proposes a supervised VAE in which only a subset of the latent dimensions contributes to the supervised task. The remaining dimensions act as nuisance parameters, and sampling them reveals the model's invariances.

Findings

1. Successfully generates invariant transformations that do not change classification.
2. Improves understanding of model decision processes through invariance analysis.
3. Enhances classification and sample generation with invariance-aware models.

Abstract

We propose to learn model invariances as a means of interpreting a model. This is motivated by a reverse engineering principle. If we understand a problem, we may introduce inductive biases in our model in the form of invariances. Conversely, when interpreting a complex supervised model, we can study its invariances to understand how that model solves a problem. To this end we propose a supervised form of variational auto-encoders (VAEs). Crucially, only a subset of the dimensions in the latent space contributes to the supervised task, allowing the remaining dimensions to act as nuisance parameters. By sampling solely the nuisance dimensions, we are able to generate samples that have undergone transformations that leave the classification unchanged, revealing the invariances of the model. Our experimental results show the capability of our proposed model both in terms of classification,…
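The latent split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: there is no encoder and no training, the classifier and decoder are random linear maps, and every name and size (`LATENT_DIM`, `TASK_DIM`, `DATA_DIM`, `classify`, `decode`, `resample_nuisance`) is a hypothetical choice made for illustration. It only shows why resampling the nuisance dimensions cannot change the prediction when the classifier reads the task dimensions alone.

```python
import random

# Hypothetical sizes, chosen only for this illustration.
LATENT_DIM = 6   # total number of latent dimensions
TASK_DIM = 2     # leading dimensions the classifier is allowed to read
DATA_DIM = 4     # size of a decoded sample
NUM_CLASSES = 3


def make_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def classify(z, clf_weights):
    """Predict a class from the task dimensions only; nuisance dims are ignored."""
    scores = matvec(clf_weights, z[:TASK_DIM])
    return max(range(len(scores)), key=scores.__getitem__)


def decode(z, dec_weights):
    """Toy linear decoder that uses every latent dimension."""
    return matvec(dec_weights, z)


def resample_nuisance(z, rng):
    """Keep the task dimensions fixed and redraw the rest from a standard normal."""
    nuisance = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM - TASK_DIM)]
    return z[:TASK_DIM] + nuisance


rng = random.Random(0)
clf_weights = make_matrix(NUM_CLASSES, TASK_DIM, rng)
dec_weights = make_matrix(DATA_DIM, LATENT_DIM, rng)

# A latent code that would normally come from the encoder.
z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
base_label = classify(z, clf_weights)

# Resample only the nuisance dimensions, then classify and decode each variant.
variants = [resample_nuisance(z, rng) for _ in range(5)]
labels = [classify(v, clf_weights) for v in variants]
samples = [decode(v, dec_weights) for v in variants]
```

In this sketch every entry of `labels` equals `base_label` by construction, while the decoded `samples` differ from one another. In a trained model, comparing such decoded samples is what would expose the transformations the classifier is invariant to.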

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning