Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples
Maura Pintor, Luca Demetrio, Angelo Sotgiu, Ambra Demontis, Nicholas Carlini, Battista Biggio, Fabio Roli

TL;DR
This paper introduces automatic indicators and a systematic protocol to debug and improve the evaluation of adversarial robustness in machine learning models, addressing current limitations and unveiling new attack failure modes.
Contribution
It categorizes attack failures, proposes six novel failure indicators, and offers a systematic debugging protocol to enhance adversarial robustness evaluations.
Findings
Indicators successfully detect attack failures in various models.
Systematic debugging improves attack effectiveness and evaluation accuracy.
Framework generalizes across multiple application domains.
Abstract
Evaluating robustness of machine-learning models to adversarial examples is a challenging problem. Many defenses have been shown to provide a false sense of robustness by causing gradient-based attacks to fail, and they have been broken under more rigorous evaluations. Although guidelines and best practices have been suggested to improve current adversarial robustness evaluations, the lack of automatic testing and debugging tools makes it difficult to apply these recommendations in a systematic manner. In this work, we overcome these limitations by: (i) categorizing attack failures based on how they affect the optimization of gradient-based attacks, while also unveiling two novel failures affecting many popular attack implementations and past evaluations; (ii) proposing six novel indicators of failure, to automatically detect the presence of such failures in the attack optimization…
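To make the idea of an automatic check on the attack optimization concrete, here is a minimal sketch of one kind of test that could flag a failing gradient-based attack: measuring how many samples yield a zero or non-finite input gradient. This is an illustration only and not one of the paper's six indicators as the authors define them; the function names (`gradient_failure_rate`, `input_gradient`), the toy linear model, and the `scale` parameter are all assumptions introduced here.

```python
import numpy as np


def gradient_failure_rate(grads, tol=1e-12):
    """Fraction of samples whose input gradient is zero or non-finite.

    Hypothetical helper: a value near 1.0 suggests the attack has no
    usable gradient signal to follow for most samples.
    """
    grads = np.asarray(grads, dtype=float).reshape(len(grads), -1)
    non_finite = ~np.isfinite(grads).all(axis=1)
    # Replace non-finite entries by 0 so the norm itself stays finite.
    norms = np.linalg.norm(np.where(np.isfinite(grads), grads, 0.0), axis=1)
    vanished = norms <= tol
    return float(np.mean(non_finite | vanished))


def input_gradient(x, w, y, scale=1.0):
    """Gradient of the logistic loss w.r.t. the input of a toy linear model.

    The model's logit is scale * (x @ w). A very large `scale` saturates
    the sigmoid, which mimics a model whose gradients vanish.
    """
    z = scale * (x @ w)
    p = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    return ((p - y) * scale)[:, None] * w[None, :]


# Two toy samples, both classified correctly by the linear model.
x = np.array([[1.0, 2.0], [-1.0, 0.5]])
w = np.array([1.0, 1.0])
y = np.array([1.0, 0.0])

healthy = gradient_failure_rate(input_gradient(x, w, y, scale=1.0))
saturated = gradient_failure_rate(input_gradient(x, w, y, scale=1e4))
print(healthy, saturated)
```

In this toy setting the unscaled model gives non-zero gradients for both samples, while the saturated model gives exactly zero gradients for both, so an attack that simply follows the gradient would stall even though the decision boundary has not moved. A real evaluation would compute the gradients from the actual model and attack loss rather than from a closed-form toy.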
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Software Testing and Debugging Techniques
