Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs

Philipp Benz; Soomin Ham; Chaoning Zhang; Adil Karjauv; In So Kweon

arXiv:2110.02797 · cs.CV · October 12, 2021 · 1 citation

TL;DR

This paper empirically compares the adversarial robustness of Vision Transformers (ViT) and MLP-Mixers with that of CNNs. It finds that ViT is generally more robust, that frequency features influence robustness, and that MLP-Mixer is highly vulnerable to universal attacks.

Contribution

It provides the first comprehensive empirical evaluation of adversarial robustness across ViT, MLP-Mixer, and CNN architectures, highlighting how they differ and which factors underlie those differences.

Findings

01

ViT is more robust than CNNs against adversarial attacks.

02

MLP-Mixer is extremely vulnerable to universal adversarial perturbations.

03

Low-frequency features contribute to ViT's robustness.

Abstract

Convolutional Neural Networks (CNNs) have become the de facto gold standard in computer vision applications in the past years. Recently, however, new model architectures have been proposed challenging the status quo. The Vision Transformer (ViT) relies solely on attention modules, while the MLP-Mixer architecture substitutes the self-attention modules with Multi-Layer Perceptrons (MLPs). Despite their great success, CNNs have been widely known to be vulnerable to adversarial attacks, causing serious concerns for security-sensitive applications. Thus, it is critical for the community to know whether the newly proposed ViT and MLP-Mixer are also vulnerable to adversarial attacks. To this end, we empirically evaluate their adversarial robustness under several adversarial attack setups and benchmark them against the widely used CNNs. Overall, we find that the two architectures, especially…
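To make the idea of an adversarial attack concrete, here is a minimal sketch of one widely used attack, the fast gradient sign method (FGSM), applied to a toy logistic-regression classifier. The abstract excerpt above does not name the attacks the paper uses, so the choice of FGSM, the toy model, and every name and value below (`fgsm`, `predict_proba`, `w`, `b`, `eps`) are illustrative assumptions and not the authors' setup.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move x by eps in the direction of the sign of the
    gradient of the cross-entropy loss with respect to the input.
    For logistic regression that gradient is (p - y) * w."""
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical weights and input, chosen only for illustration.
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.3, 0.1, 0.2]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.25)

print("clean p(class 1):      ", predict_proba(w, b, x))
print("adversarial p(class 1):", predict_proba(w, b, x_adv))
```

In this toy case the perturbation changes each input coordinate by at most `eps`, yet it pushes the predicted probability for the true class below 0.5. A robustness comparison of the kind the abstract describes would measure how often such bounded perturbations flip the predictions of each architecture.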

Code & Models

Repositories

phibenz/robustness_comparison_vit_mlp-mixer_cnn
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Integrated Circuits and Semiconductor Failure Analysis

Methods: Attention Is All You Need · Linear Layer · Average Pooling · Global Average Pooling · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · MLP-Mixer · Dropout