Evaluating and Understanding the Robustness of Adversarial Logit Pairing
Logan Engstrom, Andrew Ilyas, Anish Athalye

TL;DR
This paper evaluates the robustness of Adversarial Logit Pairing (ALP) and finds that an ALP-trained network achieves only 0.6% accuracy under attack in the threat model in which the defense is considered.
Contribution
The study assesses ALP's robustness within the defense's own threat model, summarizes the defense and its claims, and discusses the attack methodology and results, which may help explain why ALP is vulnerable.
Findings
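A network trained with ALP achieves only 0.6% accuracy under attack in the threat model in which the defense is considered.
The defense therefore does not provide the robustness it claims against adversarial examples in that threat model.
The attack methodology and results may offer insights into why ALP is vulnerable.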
Abstract
We evaluate the robustness of Adversarial Logit Pairing, a recently proposed defense against adversarial examples. We find that a network trained with Adversarial Logit Pairing achieves 0.6% accuracy in the threat model in which the defense is considered. We provide a brief overview of the defense and the threat models/claims considered, as well as a discussion of the methodology and results of our attack, which may offer insights into the reasons underlying the vulnerability of ALP to adversarial attack.
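For readers unfamiliar with the defense, the following is a minimal sketch of a logit-pairing style training loss. It rests on assumptions not stated in the abstract: the loss is taken to be a cross-entropy term on the adversarial example plus a weighted squared-difference penalty between the logits of a clean input and its adversarial counterpart. The function names and the `pairing_weight` value are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, computed from raw logits (stable log-sum-exp)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def logit_pairing_penalty(clean_logits, adv_logits):
    """Mean squared difference between clean and adversarial logits."""
    diffs = [(c - a) ** 2 for c, a in zip(clean_logits, adv_logits)]
    return sum(diffs) / len(diffs)


def alp_style_loss(clean_logits, adv_logits, label, pairing_weight=0.5):
    """Assumed form: classification loss on the adversarial example
    plus a weighted penalty that pulls the two logit vectors together.
    pairing_weight is a hypothetical hyperparameter value."""
    classification = softmax_cross_entropy(adv_logits, label)
    pairing = logit_pairing_penalty(clean_logits, adv_logits)
    return classification + pairing_weight * pairing
```

When the clean and adversarial logits coincide, the penalty is zero and the sketch reduces to ordinary cross-entropy. The penalty only constrains the logits on the adversarial examples found during training, so a low training penalty does not by itself establish robustness to a stronger attack at evaluation time.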
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research · Advanced Malware Detection Techniques
