Verifying Attention Robustness of Deep Neural Networks against Semantic Perturbations
Satoshi Munakata, Caterina Urban, Haruki Yokoyama, Koji Yamamoto, and Kazuki Munakata

TL;DR
This paper introduces a verification method for the robustness of a DNN's attention, as represented by its saliency-map, against semantic perturbations. The goal is to check that the saliency-map, and hence the basis for classification, stays consistent when the input changes.
Contribution
It presents the first approach to verify attention robustness in DNNs, using activation region traversals to determine perturbation ranges that preserve saliency-map consistency.
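The role of activation regions can be illustrated with a small sketch. It relies only on a general property of ReLU networks: inside one activation region (a fixed on/off pattern of the ReLUs) the network is affine, so a gradient-based saliency-map is constant there. The weights, the inputs, and the helper names `activation_pattern` and `input_gradient` are hypothetical and chosen for illustration; this is not the paper's traversal algorithm.

```python
import numpy as np

# Hypothetical two-layer ReLU network with fixed toy weights (not from the paper).
W1 = np.array([[1.0, -1.0], [0.5, 2.0], [-1.5, 0.5]])
b1 = np.array([0.0, -1.0, 0.5])
W2 = np.array([[1.0, -2.0, 0.5]])


def activation_pattern(x):
    """Return which hidden ReLUs are active (pre-activation > 0) for input x."""
    return tuple((W1 @ x + b1) > 0)


def input_gradient(x):
    """Gradient of the single output with respect to x.

    It depends on x only through the activation pattern, because the
    network is affine inside each activation region.
    """
    mask = np.array(activation_pattern(x), dtype=float)
    return (W2 * mask) @ W1


x_a = np.array([1.0, 0.2])   # some input
x_b = np.array([1.05, 0.2])  # nearby input in the same activation region
x_c = np.array([0.0, 1.0])   # input in a different activation region

same_region = activation_pattern(x_a) == activation_pattern(x_b)
same_gradient = np.allclose(input_gradient(x_a), input_gradient(x_b))
print(same_region, same_gradient)
print(activation_pattern(x_c), input_gradient(x_c))
```

Because the gradient can only change when the activation pattern changes, a verifier can reason region by region rather than point by point, which is one plausible reading of why traversing activation regions helps here.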
Findings
The method measures attention robustness against semantic perturbations.
Activation region traversal enhances scalability for larger DNNs.
Experimental results validate the method's ability to quantify classification basis stability.
Abstract
It is known that deep neural networks (DNNs) classify an input image by paying particular attention to certain specific pixels; a graphical representation of the magnitude of attention to each pixel is called a saliency-map. Saliency-maps are used to check the validity of the classification decision basis, e.g., it is not a valid basis for classification if a DNN pays more attention to the background rather than the subject of an image. Semantic perturbations can significantly change the saliency-map. In this work, we propose the first verification method for attention robustness, i.e., the local robustness of the changes in the saliency-map against combinations of semantic perturbations. Specifically, our method determines the range of the perturbation parameters (e.g., the brightness change) that maintains the difference between the actual saliency-map change and the expected…
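As a rough illustration of the quantities named in the abstract, the sketch below computes a gradient-based saliency-map for a toy ReLU classifier and samples brightness offsets to see over which range the map stays close to the original. Several things here are assumptions made for illustration: the random weights, the absolute-gradient saliency definition, the comparison against the unperturbed map (a simplified stand-in for the paper's criterion, which is cut off above), and the names `saliency`, `brightness`, and `sampled_stable_range`. Sampling gives no guarantee between grid points, so this is an empirical probe rather than a verification method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy classifier: 16-pixel "image", 8 hidden ReLUs, 3 classes.
W1 = rng.normal(size=(8, 16))
b1 = rng.normal(size=8)
W2 = rng.normal(size=(3, 8))


def saliency(x, label):
    """Absolute gradient of the chosen class logit with respect to each pixel."""
    mask = (W1 @ x + b1 > 0).astype(float)
    return np.abs((W2[label] * mask) @ W1)


def brightness(x, delta):
    """Semantic perturbation: shift every pixel by delta, clipped to [0, 1]."""
    return np.clip(x + delta, 0.0, 1.0)


def sampled_stable_range(x, label, deltas, tol):
    """Widest run of sampled offsets around 0 whose saliency-map stays within
    tol (L-infinity distance) of the unperturbed saliency-map."""
    base = saliency(x, label)
    ok = [np.max(np.abs(saliency(brightness(x, d), label) - base)) <= tol
          for d in deltas]
    center = int(np.argmin(np.abs(deltas)))  # index of the zero offset
    lo = hi = center
    while lo > 0 and ok[lo - 1]:
        lo -= 1
    while hi < len(deltas) - 1 and ok[hi + 1]:
        hi += 1
    return float(deltas[lo]), float(deltas[hi])


x = rng.uniform(0.0, 1.0, size=16)   # toy input image with pixels in [0, 1]
deltas = np.arange(-30, 31) / 100.0  # offsets from -0.30 to 0.30, including 0.0
lo, hi = sampled_stable_range(x, label=0, deltas=deltas, tol=1e-9)
print(f"saliency-map unchanged for sampled brightness offsets in [{lo}, {hi}]")
```

With a near-zero tolerance the reported range corresponds to the sampled offsets that do not flip any ReLU that affects the map; a larger `tol` would admit offsets where the map changes only slightly.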
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Explainable Artificial Intelligence (XAI)
