Are Transformers More Robust Than CNNs?

Yutong Bai; Jieru Mei; Alan Yuille; Cihang Xie

arXiv:2111.05464 · cs.CV · November 11, 2021 · 93 cites

TL;DR

This paper provides a fair comparison between Transformers and CNNs in visual recognition, revealing that CNNs can be as robust as Transformers with proper training, challenging previous beliefs about their robustness advantages.

Contribution

It offers the first fair, in-depth robustness comparison between Transformers and CNNs, showing CNNs can match Transformers' robustness with appropriate training methods.

Findings

01

CNNs can be as robust as Transformers against adversarial attacks when trained properly

02

Pre-training on large external datasets is not necessary for Transformers to outperform CNNs on out-of-distribution samples

03

Transformers' self-attention architecture benefits generalization

Abstract

Transformer emerges as a powerful tool for visual recognition. In addition to demonstrating competitive performance on a broad range of visual benchmarks, recent works also argue that Transformers are much more robust than Convolutional Neural Networks (CNNs). Nonetheless, surprisingly, we find these conclusions are drawn from unfair experimental settings, where Transformers and CNNs are compared at different scales and are applied with distinct training frameworks. In this paper, we aim to provide the first fair and in-depth comparisons between Transformers and CNNs, focusing on robustness evaluations. With our unified training setup, we first challenge the previous belief that Transformers outshine CNNs when measuring adversarial robustness. More surprisingly, we find CNNs can easily be as robust as Transformers on defending against adversarial attacks, if they properly adopt…

Code & Models

Repositories

ytongbai/ViTs-vs-CNNs
PyTorch · Official

Videos

Are Transformers more robust than CNNs? · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications