Interpreting Adversarial Examples with Attributes
Sadaf Gulshad, Jan Hendrik Metzen, Arnold Smeulders, Zeynep Akata

TL;DR
This paper introduces a method for interpreting adversarial examples in deep vision models by using attributes to justify decisions and analyze robustness, providing insights into model behavior under attack.
Contribution
It proposes a novel attribute-based approach to interpret and analyze adversarial examples, enabling black-box models to justify their decisions with visual attributes.
Findings
Attributes can effectively explain model decisions on clean and adversarial images.
Attribute relevance ranking correlates with decision changes under perturbations.
The approach improves understanding of model robustness and decision-making processes.
Abstract
Deep computer vision systems are vulnerable to imperceptible, carefully crafted noise, which has raised questions regarding the robustness of their decisions. We take a step back and approach this problem from an orthogonal direction. We propose to enable black-box neural networks to justify their reasoning both for clean and for adversarial examples by leveraging attributes, i.e. visually discriminative properties of objects. We rank attributes based on their class relevance, i.e. how the classification decision changes when the input is visually slightly perturbed, as well as image relevance, i.e. how well the attributes can be localized on both clean and perturbed images. We present comprehensive experiments for attribute prediction, adversarial example generation, adversarially robust learning, and their qualitative and quantitative analysis using predicted attributes on three…
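The idea of comparing attribute predictions on clean and perturbed inputs can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a random linear softmax classifier (`W`, `b`), a random linear attribute predictor (`A`), and a one-step sign-gradient (FGSM-style) perturbation, all of which are stand-ins chosen for illustration. The helper names `fgsm` and `rank_attributes_by_shift` are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def cross_entropy(x, y, W, b):
    """Cross-entropy loss of the linear classifier for label y."""
    return -np.log(softmax(W @ x + b)[y])


def fgsm(x, y, W, b, eps):
    """One-step sign-gradient perturbation (assumed attack, for illustration)."""
    p = softmax(W @ x + b)
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    grad_x = W.T @ (p - onehot)  # gradient of the cross-entropy w.r.t. x
    return x + eps * np.sign(grad_x)


def rank_attributes_by_shift(x_clean, x_adv, A):
    """Order attributes by how much their predicted score changes."""
    shift = A @ x_adv - A @ x_clean
    order = np.argsort(-np.abs(shift))  # largest absolute change first
    return order, shift


# Toy setup: every array here is random and purely hypothetical.
rng = np.random.default_rng(0)
n_features, n_classes, n_attributes = 16, 3, 5
W = rng.normal(size=(n_classes, n_features))
b = np.zeros(n_classes)
A = rng.normal(size=(n_attributes, n_features))
x = rng.normal(size=n_features)
y = int(np.argmax(W @ x + b))  # treat the clean prediction as the label

eps = 0.1
x_adv = fgsm(x, y, W, b, eps)
order, shift = rank_attributes_by_shift(x, x_adv, A)

print("clean loss:", cross_entropy(x, y, W, b))
print("adversarial loss:", cross_entropy(x_adv, y, W, b))
print("attributes ranked by score change:", order.tolist())
```

In this toy setting, the attributes at the front of `order` are the ones whose predicted scores move most under the perturbation. The abstract's class relevance and image relevance measures are defined in the paper itself and are not reproduced here.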
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Explainable Artificial Intelligence (XAI)
