Enhanced Regularizers for Attributional Robustness

Anindya Sarkar; Anirban Sarkar; Vineeth N Balasubramanian

arXiv:2012.14395 · cs.CV · August 17, 2021

TL;DR

This paper introduces a robust attribution training strategy with two new regularizers that improve the attributional robustness of deep neural networks, keeping their attribution maps consistent under attack across multiple datasets.

Contribution

The paper proposes two new regularizers designed to preserve a model's attribution map during attacks, surpassing state-of-the-art methods in attributional robustness.

Findings

01 Surpasses state-of-the-art attributional robustness methods by approximately 3% to 9% on attribution robustness measures

02 Effective across multiple datasets: MNIST, FMNIST, Flower, and GTSRB

03 Provides a systematic approach to improving the trustworthiness of explanations

Abstract

Deep neural networks are the default choice of learning models for computer vision tasks. Extensive work has been carried out in recent years on explaining deep models for vision tasks such as classification. However, recent work has shown that it is possible for these models to produce substantially different attribution maps even when two very similar images are given to the network, raising serious questions about trustworthiness. To address this issue, we propose a robust attribution training strategy to improve attributional robustness of deep neural networks. Our method carefully analyzes the requirements for attributional robustness and introduces two new regularizers that preserve a model's attribution map during attacks. Our method surpasses state-of-the-art attributional robustness methods by a margin of approximately 3% to 9% in terms of attribution robustness measures on…
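
As an illustration of what an "attribution robustness measure" can look like, here is a minimal sketch in plain Python. It compares the attribution map of a clean image with that of a slightly perturbed one using two measures common in this literature: top-k intersection and Kendall rank correlation. This page does not say which measures the paper reports, so the choice of measures, the function names, and the toy values are all assumptions made for illustration.

```python
from itertools import combinations


def top_k_intersection(attr_a, attr_b, k):
    """Fraction of the k highest-attribution positions shared by both maps."""
    def top_k(attr):
        order = sorted(range(len(attr)), key=lambda i: attr[i], reverse=True)
        return set(order[:k])
    return len(top_k(attr_a) & top_k(attr_b)) / k


def kendall_tau(attr_a, attr_b):
    """Kendall rank correlation (tau-a) between two flattened attribution maps.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(attr_a)), 2):
        sign = (attr_a[i] - attr_a[j]) * (attr_b[i] - attr_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(attr_a) * (len(attr_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical flattened attribution maps for a clean and a perturbed image.
clean = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]
perturbed = [0.8, 0.2, 0.1, 0.4, 0.9, 0.3]

print(top_k_intersection(clean, perturbed, k=3))
print(kendall_tau(clean, perturbed))
```

Under both measures, values closer to 1 mean the explanation changed less between the two inputs, which is the behavior a robust attribution training strategy aims for.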

Code & Models

Repositories

tataiani/enhanced_regularizers_attributional_robustness (TensorFlow, official)

Videos

Enhanced Regularizers for Attributional Robustness (Underline)