Lagrangian Objective Function Leads to Improved Unforeseen Attack Generalization in Adversarial Training
Mohammad Azizmalayeri, Mohammad Hossein Rohban

TL;DR
This paper introduces a Lagrangian-based modification to adversarial training that improves the model's robustness against unseen attacks, demonstrating higher accuracy and faster attack generation on CIFAR-10 and ImageNet-100 datasets.
Contribution
The paper proposes a novel Lagrangian objective function for adversarial training that enhances attack generalization and robustness against unseen adversarial examples.
Findings
Robust accuracy against unseen attacks is 5.9% higher on CIFAR-10 than that of closely related state-of-the-art adversarial training methods.
Robust accuracy against unseen attacks is 3.2% higher on ImageNet-100 under the same comparison.
The proposed attack is fast to generate and scales to large datasets.
Abstract
Recent improvements in deep learning models and their practical applications have raised concerns about the robustness of these models against adversarial examples. Adversarial training (AT) has been shown to be effective at producing a model that is robust against the attack used during training. However, it usually fails against other attacks, i.e., the model overfits to the training attack scheme. In this paper, we propose a simple modification to AT that mitigates this issue. More specifically, we minimize the perturbation norm while maximizing the classification loss in the Lagrangian form. We argue that crafting adversarial examples based on this scheme results in enhanced attack generalization in the learned model. We compare the robust accuracy of our final model, against attacks that were not used during training, with that of closely related state-of-the-art AT methods. This…
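To make the Lagrangian idea in the abstract concrete, here is a minimal sketch of one way such an attack could be written. It is not the authors' implementation: the linear softmax classifier, the squared L2 penalty, the multiplier `lam`, the step size, and all function names are assumptions chosen for illustration. The perturbation `delta` is found by gradient ascent on the objective "classification loss minus `lam` times the squared perturbation norm", so the norm is penalized rather than constrained to a fixed radius.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable log-softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def cross_entropy(W, b, x, y):
    # Per-example cross-entropy of a linear softmax classifier (assumed model).
    logp = log_softmax(x @ W + b)
    return -logp[np.arange(len(y)), y]

def input_grad(W, b, x, y):
    # Analytic gradient of the per-example cross-entropy w.r.t. the input:
    # (softmax(logits) - onehot(y)) @ W.T
    p = np.exp(log_softmax(x @ W + b))
    p[np.arange(len(y)), y] -= 1.0
    return p @ W.T

def lagrangian_attack(W, b, x, y, lam=1.0, step=0.1, n_steps=50):
    # Gradient ascent on  CE(x + delta, y) - lam * ||delta||_2^2.
    # lam, step and n_steps are illustrative values, not taken from the paper.
    delta = np.zeros_like(x)
    for _ in range(n_steps):
        grad = input_grad(W, b, x + delta, y) - 2.0 * lam * delta
        delta = delta + step * grad
    return delta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = 0.5 * rng.normal(size=(5, 3))   # toy weights: 5 features, 3 classes
    b = np.zeros(3)
    x = rng.normal(size=(4, 5))         # 4 toy inputs
    y = np.array([0, 1, 2, 0])
    delta = lagrangian_attack(W, b, x, y)
    print("loss before:", cross_entropy(W, b, x, y))
    print("loss after: ", cross_entropy(W, b, x + delta, y))
    print("perturbation norms:", np.linalg.norm(delta, axis=1))
```

In an adversarial training loop, the perturbed inputs `x + delta` would replace or augment the clean inputs at each training step. The design difference from the usual projected-gradient setup is that there is no projection onto a fixed-radius ball: the trade-off between loss and perturbation size is set by the multiplier instead.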
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
