Smoothed Geometry for Robust Attribution

Zifan Wang; Haofan Wang; Shakul Ramkumar; Matt Fredrikson; Piotr Mardziel; Anupam Datta

arXiv:2006.06643 · cs.LG · October 23, 2020 · 19 cites

TL;DR

This paper introduces a geometric regularization method and a stochastic smoothing technique that make feature attributions for deep neural networks more robust to adversarial attacks, and therefore more trustworthy as explanations.

Contribution

It proposes a regularization method and a smoothing method that improve attribution robustness by promoting Lipschitz continuity of the model's gradient, and hence smoother geometry, in DNNs.
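
For reference, the Lipschitz condition mentioned here can be written out as below. This is the standard textbook definition applied to the input gradient, not a statement copied from the paper; the symbols f (the model output being explained), x and x' (nearby inputs), L (the Lipschitz constant), and δ (the neighborhood radius) are notation assumed for illustration, and the paper's exact conditions may differ.

```latex
\[
  \bigl\| \nabla_x f(x) - \nabla_x f(x') \bigr\|_2
  \;\le\; L \, \| x - x' \|_2
  \qquad \text{for all } x' \text{ with } \| x - x' \|_2 \le \delta .
\]
```

When the attribution is the gradient itself (a saliency map), a small L means that inputs within distance δ of x receive attributions that differ by at most Lδ, which is the sense in which smoother geometry limits how far an attack can move an explanation.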

Findings

01 Regularization improves attribution stability against attacks.

02 Stochastic smoothing enhances robustness without retraining.

03 Methods are effective on large-scale image models.

Abstract

Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs), but have recently been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs. This lack of robustness is especially problematic in high-stakes applications where adversarially-manipulated explanations could impair safety and trustworthiness. Building on a geometric understanding of these attacks presented in recent work, we identify Lipschitz continuity conditions on models' gradient that lead to robust gradient-based attributions, and observe that smoothness may also be related to the ability of an attack to transfer across multiple attribution methods. To mitigate these attacks in practice, we propose an inexpensive regularization method that promotes these conditions in DNNs, as well as a stochastic smoothing technique that does not require…
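
To make the idea of stochastic smoothing concrete, the following is a minimal sketch and not the authors' implementation (the official repository listed on this page is in TensorFlow). The abstract is truncated before it describes the technique, so the sketch assumes one common form of smoothing: averaging the gradient over Gaussian perturbations of the input. The toy model, the function names (`relu_model_grad`, `smoothed_grad`, `l2_distance`), and the values of `sigma` and `n_samples` are all hypothetical choices made for illustration.

```python
import random


def relu_model_grad(x):
    """Gradient of the toy model f(x) = relu(x[0]) * x[1] with respect to x.

    The gradient jumps as x[0] crosses zero, which mimics the sharp
    decision geometry that attribution attacks can exploit.
    """
    gate = 1.0 if x[0] > 0 else 0.0
    return [gate * x[1], max(x[0], 0.0)]


def smoothed_grad(grad_fn, x, sigma=0.5, n_samples=2000, seed=0):
    """Average grad_fn over Gaussian perturbations of x (assumed smoothing form).

    A fixed seed makes the estimate reproducible and reuses the same noise
    draws for every input, so two nearby inputs are compared fairly.
    """
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        grad = grad_fn(noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]


def l2_distance(a, b):
    """Euclidean distance between two attribution vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


if __name__ == "__main__":
    # Two inputs that are close together but sit on opposite sides of the kink.
    x_a = [-0.01, 1.0]
    x_b = [0.01, 1.0]

    plain_gap = l2_distance(relu_model_grad(x_a), relu_model_grad(x_b))
    smooth_gap = l2_distance(
        smoothed_grad(relu_model_grad, x_a),
        smoothed_grad(relu_model_grad, x_b),
    )
    print("attribution gap, plain gradient:   ", plain_gap)
    print("attribution gap, smoothed gradient:", smooth_gap)
```

In this toy setting the plain gradient changes abruptly between the two inputs, while the averaged gradient changes only slightly. No retraining is involved because the smoothing is applied when the attribution is computed, at the cost of `n_samples` gradient evaluations per input.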

Code & Models

Repositories

zifanw/smoothed_geometry · tf · Official

Videos

Smoothed Geometry for Robust Attribution · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Machine Learning and Algorithms