Do Adversarially Robust ImageNet Models Transfer Better?
Hadi Salman, Andrew Ilyas, Logan Engstrom, Ashish Kapoor, Aleksander Madry

TL;DR
Adversarially robust ImageNet models, despite lower ImageNet accuracy, often transfer better to downstream classification tasks than standard-trained models, suggesting that robustness improves the feature representations used in transfer learning.
Contribution
This work shows that adversarially robust ImageNet models often yield better transfer learning performance than their standard-trained counterparts, identifying robustness as a factor in transfer performance alongside pre-training accuracy.
Findings
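Robust models often outperform standard-trained models on a standard suite of downstream classification tasks, despite being less accurate on ImageNet.
Robust models appear to learn feature representations that transfer better.
The results are consistent with, and add to, the hypothesis that robustness enhances feature quality.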
Abstract
Transfer learning is a widely-used paradigm in deep learning, where models pre-trained on standard datasets can be efficiently adapted to downstream tasks. Typically, better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of transfer learning performance. In this work, we identify another such aspect: we find that adversarially robust models, while less accurate, often perform better than their standard-trained counterparts when used for transfer learning. Specifically, we focus on adversarially robust ImageNet classifiers, and show that they yield improved accuracy on a standard suite of downstream classification tasks. Further analysis uncovers more differences between robust and standard models in the context of transfer learning. Our results are consistent with (and in fact, add to) recent hypotheses stating that robustness leads to…
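As a rough illustration of what "adapting a pre-trained model to a downstream task" can mean, the sketch below freezes a feature extractor and trains only a linear classifier on top of it. This is a toy setup, not the paper's experimental pipeline: the random ReLU layer `w_frozen` is a hypothetical stand-in for a pre-trained backbone, the two-blob dataset is made up, and all function names are illustrative.

```python
import numpy as np


def extract_features(x, w_frozen):
    """Frozen backbone stand-in: one fixed random ReLU layer (hypothetical)."""
    return np.maximum(x @ w_frozen, 0.0)


def train_linear_head(feats, labels, n_classes, lr=0.05, epochs=300):
    """Fit a softmax classifier on frozen features with gradient descent."""
    n, d = feats.shape
    w = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ w + b
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        w -= lr * (feats.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b


def accuracy(feats, labels, w, b):
    """Fraction of samples whose highest-scoring class matches the label."""
    preds = (feats @ w + b).argmax(axis=1)
    return float(np.mean(preds == labels))


def make_toy_task(rng, n_per_class, dim):
    """Toy downstream task: two Gaussian blobs (made-up data, not from the paper)."""
    x = np.concatenate([
        rng.normal(-1.0, 1.0, size=(n_per_class, dim)),
        rng.normal(1.0, 1.0, size=(n_per_class, dim)),
    ])
    y = np.concatenate([
        np.zeros(n_per_class, dtype=int),
        np.ones(n_per_class, dtype=int),
    ])
    return x, y


rng = np.random.default_rng(0)
dim, width = 8, 32
w_frozen = rng.normal(size=(dim, width)) / np.sqrt(dim)  # never updated below

x_train, y_train = make_toy_task(rng, 100, dim)
x_test, y_test = make_toy_task(rng, 100, dim)

# Only the linear head (w, b) is trained; the backbone weights stay fixed.
w, b = train_linear_head(extract_features(x_train, w_frozen), y_train, n_classes=2)
test_acc = accuracy(extract_features(x_test, w_frozen), y_test, w, b)
print(f"held-out accuracy of the linear head: {test_acc:.3f}")
```

Under this setup, comparing two pre-trained models would amount to running the same head-training procedure on each model's frozen features and comparing held-out accuracy on the downstream task.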
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
