Robustness May Be at Odds with Accuracy

Dimitris Tsipras; Shibani Santurkar; Logan Engstrom; Alexander Turner; Aleksander Madry

arXiv:1805.12152 · stat.ML · September 10, 2019 · 373 cites

TL;DR

This paper shows that adversarial robustness and standard accuracy can be inherently at odds: robust models learn different features than standard models, which can lower standard accuracy but tends to align better with human perception.

Contribution

It proves that a robustness-accuracy trade-off exists in a fairly simple and natural setting, and attributes the trade-off to robust classifiers learning fundamentally different feature representations than standard classifiers.

Findings

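01 Training for robustness may reduce standard accuracy, in addition to being more resource-consuming.

02 Robust classifiers learn different feature representations, which tend to align better with salient data characteristics and human perception.

03 The trade-off provably exists in a fairly simple and natural setting.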

Abstract

We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.
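The trade-off described in the abstract can be illustrated numerically. The sketch below is not the authors' code; it assumes a toy binary task in the spirit of the paper's simple setting, with one feature that agrees with the label with probability `p` and many weakly correlated Gaussian features with mean `eta * y`. The function names, the parameter values (`d`, `p`, `eta`, `eps`), and the two hand-picked linear classifiers are all illustrative assumptions.

```python
import numpy as np


def sample(n, d, p, eta, rng):
    """Draw n points: one strongly correlated feature plus d weak Gaussian features."""
    y = rng.choice([-1.0, 1.0], size=n)
    agrees = rng.random(n) < p
    x1 = np.where(agrees, y, -y)                       # equals y with probability p
    weak = rng.normal(loc=eta * y[:, None], scale=1.0, size=(n, d))
    return np.column_stack([x1, weak]), y


def accuracy(w, x, y):
    """Accuracy of the linear classifier sign(w . x)."""
    return float(np.mean(np.sign(x @ w) == y))


def worst_case_linf(w, x, y, eps):
    """Worst-case l_inf perturbation of size eps against a linear classifier.

    Each coordinate moves by eps in the direction that lowers y * (w . x).
    """
    return x - eps * y[:, None] * np.sign(w)[None, :]


def evaluate(d=400, p=0.95, n=10000, seed=0):
    rng = np.random.default_rng(seed)
    eta = 3.0 / np.sqrt(d)   # assumed: weak features are individually barely informative
    eps = 2.0 * eta          # assumed: budget large enough to flip the weak features' mean
    x, y = sample(n, d, p, eta, rng)

    # Classifier A averages the weak features and ignores the first feature.
    w_weak = np.concatenate([[0.0], np.ones(d) / d])
    # Classifier B uses only the first feature.
    w_strong = np.concatenate([[1.0], np.zeros(d)])

    results = {}
    for name, w in [("weak_features", w_weak), ("strong_feature", w_strong)]:
        x_adv = worst_case_linf(w, x, y, eps)
        results[name] = {
            "standard": accuracy(w, x, y),
            "adversarial": accuracy(w, x_adv, y),
        }
    return results


if __name__ == "__main__":
    for name, acc in evaluate().items():
        print(name, acc)
```

Under these assumptions, the classifier that averages the weak features reaches higher standard accuracy than the one relying on the single strong feature, but its accuracy collapses under the perturbation, while the single-feature classifier is unaffected. This mirrors, in a hand-constructed way, the tension the abstract describes; it does not reproduce the paper's proofs or experiments.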

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning