Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness
Xiao Yang, Yinpeng Dong, Wenzhao Xiang, Tianyu Pang, Hang Su, Jun Zhu

TL;DR
This paper introduces MAMA, a model-agnostic meta-attack method that automatically learns stronger adversarial attack algorithms, improving robustness evaluation of neural networks with minimal additional computation.
Contribution
The paper proposes a meta-learning approach that automatically discovers adversarial attacks which generalize across different defenses, making robustness evaluation more reliable.
Findings
MAMA outperforms state-of-the-art attacks on various defenses.
The learned optimizer generalizes well to unseen defenses.
The approach improves robustness evaluation with little extra computational cost.
Abstract
The vulnerability of deep neural networks to adversarial examples has motivated an increasing number of defense strategies for promoting model robustness. However, the progress is usually hampered by insufficient robustness evaluations. As the de facto standard to evaluate adversarial robustness, adversarial attacks typically solve an optimization problem of crafting adversarial examples with an iterative process. In this work, we propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically. Our method learns the optimizer in adversarial attacks parameterized by a recurrent neural network, which is trained over a class of data samples and defenses to produce effective update directions during adversarial example generation. Furthermore, we develop a model-agnostic training algorithm to improve the generalization ability of the learned…
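The abstract describes an iterative attack whose update direction comes from a learned optimizer rather than a fixed rule. The sketch below illustrates only that structure and is not the authors' implementation. It assumes a toy linear classifier with logistic loss, an L-infinity perturbation budget, and a hand-written recurrent rule (`make_recurrent_step`, a hypothetical name) standing in for the slot where a trained recurrent neural network would go. All function names, weights, and hyperparameters here are illustrative assumptions.

```python
import numpy as np


def loss_and_grad(w, b, x, y):
    """Logistic loss of a toy linear classifier and its gradient w.r.t. the input x."""
    z = float(w @ x + b)
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad_x = (p - y) * w  # d(loss)/dz = p - y, and dz/dx = w
    return loss, grad_x


def sign_step(grad, state):
    """Hand-designed baseline: PGD-style sign-gradient direction, stateless."""
    return np.sign(grad), state


def make_recurrent_step(decay=0.9):
    """Hypothetical stand-in for a learned optimizer.

    It keeps a hidden state across iterations, as a recurrent network would,
    but the rule is fixed (accumulated gradients), not trained.
    """
    def step(grad, state):
        hidden = grad if state is None else decay * state + grad
        return np.sign(hidden), hidden
    return step


def iterative_attack(w, b, x, y, eps, alpha, steps, update_fn):
    """Generic iterative attack: the optimizer `update_fn` is pluggable."""
    x_adv = x.copy()
    state = None
    for _ in range(steps):
        _, grad = loss_and_grad(w, b, x_adv, y)
        direction, state = update_fn(grad, state)
        x_adv = x_adv + alpha * direction          # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project onto the L-inf ball
    return x_adv


# Toy setup (all values are illustrative assumptions).
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([0.5, -0.5, 1.0])
y = 1
eps, alpha, steps = 0.1, 0.025, 10

loss_clean, _ = loss_and_grad(w, b, x, y)
x_sign = iterative_attack(w, b, x, y, eps, alpha, steps, sign_step)
x_rec = iterative_attack(w, b, x, y, eps, alpha, steps, make_recurrent_step())
loss_sign, _ = loss_and_grad(w, b, x_sign, y)
loss_rec, _ = loss_and_grad(w, b, x_rec, y)
```

Because the attack loop only depends on the `update_fn(grad, state)` interface, swapping the hand-designed rule for a trained recurrent model would not change `iterative_attack` itself. On this linear toy problem both rules reach the same perturbation, so the sketch shows the interface, not any performance difference.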
Taxonomy
Topics: Adversarial Robustness in Machine Learning
