A Note on Implementation Errors in Recent Adaptive Attacks Against Multi-Resolution Self-Ensembles
Stanislav Fort

TL;DR
This paper identifies an implementation flaw in recent adaptive attacks on multi-resolution self-ensembles. When the attacks are properly constrained, the defense retains non-trivial robustness and the resulting perturbations often align with human perception, which underscores the need for careful validation.
Contribution
It uncovers a critical implementation error in recent adaptive attacks and demonstrates the importance of correct attack constraints for accurate robustness evaluation.
Findings
Properly constrained attacks do not break the defense.
Correctly bounded attacks often align with human perception.
The results highlight the need for careful validation in adversarial ML research.
Abstract
This note documents an implementation issue in recent adaptive attacks (Zhang et al. [2024]) against the multi-resolution self-ensemble defense (Fort and Lakshminarayanan [2024]). The implementation allowed adversarial perturbations to exceed the standard L∞ = 8/255 bound by up to a factor of 20, reaching magnitudes of up to L∞ = 160/255. When attacks are properly constrained within the intended bounds, the defense maintains non-trivial robustness. Beyond highlighting the importance of careful validation in adversarial machine learning research, our analysis reveals an intriguing finding: properly bounded adaptive attacks against strong multi-resolution self-ensembles often align with human perception, suggesting the need to reconsider how we measure adversarial robustness.
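The kind of validation the abstract calls for can be illustrated with a minimal sketch, which is not taken from either paper's code. It measures the L∞ size of a perturbation and projects an over-budget adversarial image back into the allowed ball. The helper names `linf_norm` and `project_linf`, the random stand-in images, and the budget `EPS = 8/255` (a commonly used value, assumed here) are all illustrative assumptions, with pixel values assumed to lie in [0, 1].

```python
import numpy as np

# Assumed perturbation budget (a commonly used L-infinity value); hypothetical here.
EPS = 8 / 255


def linf_norm(x_clean, x_adv):
    """Largest absolute per-pixel difference between the two images."""
    return float(np.max(np.abs(x_adv - x_clean)))


def project_linf(x_clean, x_adv, eps):
    """Clip the perturbation to [-eps, eps], then clip the image to [0, 1]."""
    delta = np.clip(x_adv - x_clean, -eps, eps)
    return np.clip(x_clean + delta, 0.0, 1.0)


# Random stand-in data, not real images: a "clean" image and an adversarial
# candidate whose perturbation is allowed to be far larger than EPS.
rng = np.random.default_rng(0)
x_clean = rng.uniform(0.0, 1.0, size=(3, 32, 32))
noise = rng.uniform(-20 * EPS, 20 * EPS, size=x_clean.shape)
x_adv = np.clip(x_clean + noise, 0.0, 1.0)

# Project back into the budget and compare the measured norms.
x_proj = project_linf(x_clean, x_adv, EPS)
print("budget exceeded before projection:", linf_norm(x_clean, x_adv) > EPS)
print("within budget after projection:", linf_norm(x_clean, x_proj) <= EPS + 1e-12)
```

Asserting on the measured norm of the final adversarial image, not only on the intermediate perturbation, is one inexpensive way to catch a bound that a later processing step silently violates.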
Taxonomy
Topics: Scientific Computing and Data Management
Methods: ALIGN
