Revisiting adapters with adversarial training

Sylvestre-Alvise Rebuffi; Francesco Croce; Sven Gowal

arXiv:2210.04886·cs.CV·October 11, 2022

TL;DR

This paper shows that co-training a Vision Transformer on clean and adversarial inputs with lightweight adapters improves clean accuracy, enables model soups that trade off clean and adversarial performance, and adapts well to distribution shifts, while adding far fewer parameters than dual normalization layers.

Contribution

It shows that adapters with a few domain-specific parameters per input type suffice for co-training on clean and adversarial inputs, without separating batch statistics, which improves clean accuracy and enables flexible model combinations.

Findings

01. Improved the top-1 accuracy of a ViT-B16 on ImageNet by +1.12% over a non-adversarially trained baseline.

02. Enabled model soups that trade off clean and adversarial performance.

03. Achieved +4.00% higher accuracy on ImageNet variants.
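The trade-off in finding 02 can be illustrated with a small sketch. It assumes the soup is a linear interpolation between a clean and an adversarial classification token that share one backbone; the names `soup_token` and `beta`, and the toy two-dimensional tokens, are hypothetical and not taken from the paper's code.

```python
from typing import List

Vector = List[float]


def soup_token(clean_token: Vector, adversarial_token: Vector, beta: float) -> Vector:
    """Linearly interpolate two classification tokens.

    beta = 1.0 returns the clean token, beta = 0.0 the adversarial one.
    Intermediate values give a single token between the two, so inference
    cost stays that of one model (assumption: the backbone is shared).
    """
    if len(clean_token) != len(adversarial_token):
        raise ValueError("tokens must have the same dimension")
    return [
        beta * c + (1.0 - beta) * a
        for c, a in zip(clean_token, adversarial_token)
    ]


# Toy 2-dimensional tokens, for illustration only.
clean = [1.0, 0.0]
adversarial = [0.0, 1.0]
halfway = soup_token(clean, adversarial, beta=0.5)
```

Sweeping `beta` over a grid and evaluating each resulting token on clean and adversarial data would trace out the trade-off curve; which value is preferable depends on the deployment setting.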

Abstract

While adversarial training is generally used as a defense mechanism, recent works show that it can also act as a regularizer. By co-training a neural network on clean and adversarial inputs, it is possible to improve classification accuracy on the clean, non-adversarial inputs. We demonstrate that, contrary to previous findings, it is not necessary to separate batch statistics when co-training on clean and adversarial inputs, and that it is sufficient to use adapters with few domain-specific parameters for each type of input. We establish that using the classification token of a Vision Transformer (ViT) as an adapter is enough to match the classification performance of dual normalization layers, while using significantly fewer additional parameters. First, we improve upon the top-1 accuracy of a non-adversarially trained ViT-B16 model by +1.12% on ImageNet (reaching 83.76% top-1…
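As a rough illustration of the adapter idea in the abstract, the sketch below prepends a domain-specific classification token to otherwise shared patch embeddings and compares the extra parameter count against duplicating every normalization layer. All function names are hypothetical. The parameter counts assume a common ViT-B layout (embedding dimension 768, 12 blocks, two LayerNorms per block plus a final one, each with a scale and a bias); the paper's exact accounting may differ.

```python
from typing import Dict, List

Vector = List[float]


def prepend_class_token(
    patch_embeddings: List[Vector],
    tokens: Dict[str, Vector],
    input_type: str,
) -> List[Vector]:
    """Build the token sequence for one image.

    Only the classification token depends on the input type
    ("clean" or "adversarial"); everything else is shared.
    """
    token = tokens[input_type]
    return [list(token)] + [list(p) for p in patch_embeddings]


def extra_params_token_adapter(embed_dim: int) -> int:
    """One additional classification token of size embed_dim."""
    return embed_dim


def extra_params_dual_layernorm(embed_dim: int, depth: int) -> int:
    """A second copy of every LayerNorm (assumed layout, see lead-in)."""
    num_layernorms = 2 * depth + 1  # two per block plus a final one
    return num_layernorms * 2 * embed_dim  # scale and bias per LayerNorm


# Toy example: two 2-dimensional patches routed through the adversarial token.
tokens = {"clean": [1.0, 1.0], "adversarial": [-1.0, -1.0]}
sequence = prepend_class_token([[0.1, 0.2], [0.3, 0.4]], tokens, "adversarial")

token_cost = extra_params_token_adapter(768)
dual_norm_cost = extra_params_dual_layernorm(768, depth=12)
```

Under these assumptions the token adapter adds 768 parameters, against 38,400 for a duplicated set of LayerNorms, which is consistent with the abstract's claim of significantly fewer additional parameters.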

Videos

Revisiting adapters with adversarial training · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications

Methods: Model Soups · Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Dropout · Softmax · Label Smoothing · Adam · Adapter