Adversarial Robustness as a Prior for Learned Representations

Logan Engstrom; Andrew Ilyas; Shibani Santurkar; Dimitris Tsipras; Brandon Tran; Aleksander Madry

arXiv:1906.00945 · stat.ML · September 30, 2019 · 76 cites

TL;DR

This paper demonstrates that adversarial robustness in deep learning models acts as a prior that enhances the quality and interpretability of learned feature representations, making them more invertible and manipulable.

Contribution

The paper shows that robust optimization can be recast as a way to enforce priors on the features a deep network learns, yielding higher-level, approximately invertible, and more interpretable representations.
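As a rough illustration of what "robust optimization" means here, the sketch below trains against the worst-case perturbation of each input instead of the input itself. It is a minimal toy under stated assumptions, not the paper's training code: the model is a linear logistic classifier, the perturbation set is an l2 ball of radius `eps`, and every function name is hypothetical. For a linear model the inner maximization has a closed form; deep networks typically approximate it with an iterative attack instead.

```python
import math

def logistic_loss(w, b, x, y):
    """Logistic loss for a label y in {-1, +1} under a linear model (toy assumption)."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return math.log1p(math.exp(-margin))

def worst_case_input(w, x, y, eps):
    """Inner maximization: for a linear model, the l2-bounded worst case
    moves x by eps along -y * w / ||w||, which lowers the margin the most."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        return list(x)
    return [xi - eps * y * wi / norm for wi, xi in zip(w, x)]

def robust_sgd_step(w, b, x, y, eps, lr):
    """Outer minimization: one gradient step on the loss at the worst-case input.
    The perturbed input is treated as fixed while differentiating."""
    x_adv = worst_case_input(w, x, y, eps)
    margin = y * (sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
    coeff = -y / (1.0 + math.exp(margin))  # derivative of the loss w.r.t. the score
    w_new = [wi - lr * coeff * xi for wi, xi in zip(w, x_adv)]
    b_new = b - lr * coeff
    return w_new, b_new

def robust_loss(w, b, x, y, eps):
    """Loss evaluated at the worst-case input inside the eps-ball."""
    return logistic_loss(w, b, worst_case_input(w, x, y, eps), y)
```

Calling `robust_sgd_step` repeatedly over a dataset gives a bare-bones form of adversarial training. The "prior" reading is that the learned weights are pushed toward features whose predictions stay stable under small input changes.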

Findings

01 Robust models produce approximately invertible representations.

02 Robust representations allow direct visualization and manipulation of salient input features.

03 Adversarial robustness is a promising route to improving learned representations more broadly.

Abstract

An important goal in deep learning is to learn versatile, high-level feature representations of input data. However, standard networks' representations seem to possess shortcomings that, as we illustrate, prevent them from fully realizing this goal. In this work, we show that robust optimization can be re-cast as a tool for enforcing priors on the features learned by deep neural networks. It turns out that representations learned by robust models address the aforementioned shortcomings and make significant progress towards learning a high-level encoding of inputs. In particular, these representations are approximately invertible, while allowing for direct visualization and manipulation of salient input features. More broadly, our results indicate adversarial robustness as a promising avenue for improving learned representations. Our code and models for reproducing these results is…
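To make "approximately invertible" concrete: a representation can be inverted by searching for an input whose representation matches that of a target input. The sketch below does this by gradient descent on a toy problem. Everything in it is an assumption for illustration: the map `represent` is a hypothetical two-dimensional `tanh(W x)` rather than a trained network, the objective is a plain squared l2 distance between representations, and the step size and iteration count are arbitrary.

```python
import math

# Hypothetical toy "representation": R(x) = tanh(W x) with a fixed 2x2 matrix.
W = [[1.0, 0.5], [-0.5, 1.0]]

def represent(x):
    """Map an input vector to its toy representation."""
    return [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in W]

def rep_distance(x_a, x_b):
    """Squared l2 distance between the representations of two inputs."""
    return sum((a - c) ** 2 for a, c in zip(represent(x_a), represent(x_b)))

def invert(target_rep, x_start, lr=0.1, steps=500):
    """Gradient descent on the input so that its representation matches target_rep."""
    x = list(x_start)
    for _ in range(steps):
        r = represent(x)
        # Chain rule: d/dx sum_i (r_i - t_i)^2 = 2 * W^T [(r - t) * (1 - r^2)]
        g = [2.0 * (ri - ti) * (1.0 - ri * ri) for ri, ti in zip(r, target_rep)]
        grad = [sum(W[i][j] * g[i] for i in range(len(W))) for j in range(len(x))]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x

x_true = [0.4, -0.3]
x_rec = invert(represent(x_true), x_start=[0.0, 0.0])
```

In this toy the map is one-to-one, so matching the representation recovers the input. The abstract's claim is about the harder case of deep networks, where a robust model's representation reportedly retains enough of the input for this kind of search to return something close to the original.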

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications