Understanding Misclassifications by Attributes
Sadaf Gulshad, Zeynep Akata, Jan Hendrik Metzen, and Arnold Smeulders

TL;DR
This paper investigates how deep neural networks' attribute predictions change under adversarial attacks, revealing differences between standard and robust models and factors influencing robustness across datasets.
Contribution
The paper proposes a metric that quantifies how robust an adversarially robust network is against adversarial attacks, and it analyzes how predicted attributes behave in standard and robust networks across datasets.
Findings
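In standard networks, attributes predicted for adversarial images are consistent with the wrong class, while attributes predicted for clean images are consistent with the true class.
In adversarially robust networks, attributes predicted for correctly classified adversarial images remain consistent with the true class.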
Robustification effectiveness varies with dataset granularity and noise level.
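To make the notion of attributes being "consistent with" a class concrete, the sketch below checks which class a predicted attribute vector most resembles. It is an illustration only, not the paper's method or its proposed metric: the function name `nearest_class_by_attributes`, the use of cosine similarity, and the toy attribute values are all assumptions introduced here.

```python
import numpy as np


def nearest_class_by_attributes(pred_attrs, class_attrs):
    """Return the index of the class whose attribute vector is most
    cosine-similar to the predicted attributes (hypothetical helper)."""
    eps = 1e-12  # guards against division by zero for all-zero vectors
    pred = pred_attrs / (np.linalg.norm(pred_attrs) + eps)
    protos = class_attrs / (np.linalg.norm(class_attrs, axis=1, keepdims=True) + eps)
    return int(np.argmax(protos @ pred))


# Toy per-class attribute annotations (assumed values, one row per class).
class_attrs = np.array([
    [1.0, 0.0, 1.0, 0.0],  # class 0
    [0.0, 1.0, 0.0, 1.0],  # class 1
])

# Assumed attribute predictions for one image whose true class is 0.
clean_attrs = np.array([0.9, 0.1, 0.8, 0.2])  # predicted on the clean image
adv_attrs = np.array([0.2, 0.7, 0.1, 0.9])    # predicted on its adversarial version

clean_match = nearest_class_by_attributes(clean_attrs, class_attrs)
adv_match = nearest_class_by_attributes(adv_attrs, class_attrs)
print(clean_match, adv_match)
```

In this toy setup the clean attributes sit closest to the true class and the adversarial attributes sit closest to the other class, which mirrors the behavior described for standard networks.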
Abstract
In this paper, we aim to understand and explain the decisions of deep neural networks by studying the behavior of predicted attributes when adversarial examples are introduced. We study the changes in attributes for clean as well as adversarial images in both standard and adversarially robust networks. We propose a metric to quantify the robustness of an adversarially robust network against adversarial attacks. In a standard network, attributes predicted for adversarial images are consistent with the wrong class, while attributes predicted for the clean images are consistent with the true class. In an adversarially robust network, the attributes predicted for adversarial images classified correctly are consistent with the true class. Finally, we show that the ability to robustify a network varies for different datasets. For the fine grained dataset, it is higher as compared to the…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications
