Federated Adversarial Training with Transformers
Ahmed Aldahdooh, Wassim Hamidouche, Olivier Déforges

TL;DR
This paper explores the application of adversarial training in federated learning for vision transformers, proposing a new aggregation method, FedWAvg, to enhance model robustness against adversarial attacks.
Contribution
It introduces FedWAvg, a novel federated aggregation technique that improves robustness of vision transformer models trained with adversarial training in non-IID settings.
Findings
FedWAvg outperforms existing aggregation methods in robust accuracy.
Adversarial training is feasible and effective for vision transformers in federated learning.
The proposed method enhances model robustness against evasion attacks.
Abstract
Federated learning (FL) has emerged to enable global model training over distributed clients' data while preserving the privacy of that data. However, the trained global model is vulnerable to evasion attacks, especially adversarial examples (AEs): carefully crafted samples designed to cause misclassification. Adversarial training (AT) has been found to be the most promising defense against evasion attacks and has been widely studied for convolutional neural networks (CNNs). Recently, vision transformers have been found to be effective in many computer vision tasks. To the best of the authors' knowledge, no prior work has studied the feasibility of AT in an FL process for vision transformers. This paper investigates that feasibility with different federated model aggregation methods and different vision transformer models with different tokenization and classification head techniques. In order to improve…
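For readers unfamiliar with federated model aggregation, the following is a minimal sketch of the standard sample-size-weighted averaging baseline (commonly called FedAvg). It is not the paper's FedWAvg method, whose weighting scheme is not described in this summary. The function name `fedavg`, the dictionary-of-lists parameter layout, and the toy client values are all assumptions made for illustration.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameters, weighting each client by its sample count.

    client_weights: list of dicts mapping a parameter name to a flat list of floats
                    (a hypothetical layout chosen for this sketch).
    client_sizes:   list of local training-set sizes, one per client.
    """
    total = float(sum(client_sizes))
    averaged = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Each coordinate of the global parameter is the weighted mean
        # of the corresponding coordinate across all clients.
        averaged[name] = [
            sum(w[name][i] * n / total for w, n in zip(client_weights, client_sizes))
            for i in range(length)
        ]
    return averaged


# Toy example: two clients holding 1 and 3 samples respectively.
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
sizes = [1, 3]
global_weights = fedavg(clients, sizes)
```

In a federated adversarial training round, each client would presumably train locally on adversarial examples before its parameters reach an aggregation step like this one; the choice of per-client weights is the part that an alternative aggregation method would change.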
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Residual Connection · Dense Connections · Vision Transformer
