Exploring the Relationship between Architecture and Adversarially Robust Generalization

Aishan Liu; Shiyu Tang; Siyuan Liang; Ruihao Gong; Boxi Wu; Xianglong Liu; Dacheng Tao

arXiv:2209.14105·cs.LG·March 13, 2023·5 cites

Open Access

TL;DR

This paper systematically investigates how neural network architecture influences adversarially robust generalization, revealing that Vision Transformers often outperform CNNs due to higher weight sparsity, supported by theoretical analysis.

Contribution

The paper presents what it describes as the first systematic study of how architectural design, especially attention mechanisms, relates to adversarially robust generalization in deep neural networks.

Findings

01. Under aligned settings, Vision Transformers (e.g., PVT, CoAtNet) often show better adversarially robust generalization than CNNs.

02. Higher weight sparsity correlates with improved robustness.

03. Theoretical analysis via Rademacher complexity explains the observed differences.

Abstract

Adversarial training has been demonstrated to be one of the most effective remedies for defending against adversarial examples, yet it often suffers from a large robustness generalization gap on unseen testing adversaries, known as the adversarially robust generalization problem. Despite preliminary work on understanding adversarially robust generalization, little is known from the architectural perspective. To bridge the gap, this paper for the first time systematically investigates the relationship between adversarially robust generalization and architectural design. In particular, we comprehensively evaluated the 20 most representative adversarially trained architectures on the ImageNette and CIFAR-10 datasets against multiple $\ell_p$-norm adversarial attacks. Based on the extensive experiments, we found that, under aligned settings, Vision Transformers (e.g., PVT, CoAtNet) often yield better…
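
The robustness generalization gap mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's method or experimental setup. It assumes a linear classifier with labels in {-1, +1}, for which the worst-case perturbation inside an l_inf ball has a closed form, and it takes the gap to be robust accuracy on training data minus robust accuracy on held-out data. The function names (`linf_worst_case`, `robust_accuracy`, `robust_generalization_gap`) and the toy data are hypothetical.

```python
import numpy as np


def linf_worst_case(w, X, y, eps):
    """Worst-case l_inf perturbation of X for the linear classifier sign(X @ w + b).

    Assumption: labels y are in {-1, +1}. For a linear model, the perturbation
    that most reduces the margin y * (w . x + b) within an l_inf ball of
    radius eps is -eps * y * sign(w).
    """
    return X - eps * y[:, None] * np.sign(w)[None, :]


def robust_accuracy(w, b, X, y, eps):
    """Accuracy of the linear classifier on worst-case perturbed inputs."""
    X_adv = linf_worst_case(w, X, y, eps)
    return float(np.mean(np.sign(X_adv @ w + b) == y))


def robust_generalization_gap(w, b, X_train, y_train, X_test, y_test, eps):
    """Train robust accuracy minus test robust accuracy (illustrative definition)."""
    train_rob = robust_accuracy(w, b, X_train, y_train, eps)
    test_rob = robust_accuracy(w, b, X_test, y_test, eps)
    return train_rob - test_rob


# Toy data: training points sit far from the decision boundary,
# held-out points sit within eps of it.
w = np.array([1.0, 0.0])
b = 0.0
X_train = np.array([[1.0, 0.0], [-1.0, 0.0]])
y_train = np.array([1.0, -1.0])
X_test = np.array([[0.3, 0.0], [-0.3, 0.0]])
y_test = np.array([1.0, -1.0])

gap = robust_generalization_gap(w, b, X_train, y_train, X_test, y_test, eps=0.5)
print(gap)  # 1.0: robust on the training points, not on the held-out points
```

In this toy case both sets are classified perfectly without perturbation, so the gap only appears once the attack is applied. For deep networks the worst-case perturbation has no closed form and is typically approximated with iterative attacks.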

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Integrated Circuits and Semiconductor Failure Analysis

Methods: Attention Is All You Need · Linear Layer · Softmax · Residual Connection · Dense Connections · Layer Normalization · Multi-Head Attention · Absolute Position Encodings · Spatial-Reduction Attention · Pyramid Vision Transformer