Certified $\ell_2$ Attribution Robustness via Uniformly Smoothed Attributions

Fan Wang; Adams Wai-Kin Kong

arXiv:2405.06361·cs.LG·May 13, 2024

TL;DR

This paper introduces a certified defense method for model attribution robustness using uniform smoothing, guaranteeing attribution stability against small input perturbations across various models and datasets.

Contribution

It proposes a novel uniform smoothing technique for attribution robustness and provides theoretical certification bounds for perturbation sizes.

Findings

01 Effective protection of attributions from attacks across different architectures.

02 The certification guarantees hold within a defined perturbation region.

03 Method demonstrates robustness on three diverse datasets.

Abstract

Model attribution is a popular tool for explaining the rationale behind model predictions. However, recent work suggests that attributions are vulnerable to minute perturbations, which can be added to input samples to fool the attributions while leaving the prediction outputs unchanged. Although empirical studies have shown positive results with adversarial training, an effective certified defense method is still needed to understand the robustness of attributions. In this work, we propose a uniform smoothing technique that augments the vanilla attributions with noise sampled uniformly from a certain space. We prove that, for all perturbations within the attack region, the cosine similarity between the uniformly smoothed attributions of the perturbed sample and the unperturbed sample is guaranteed to be lower bounded. We also derive alternative formulations of the certification that is…
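The smoothing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the noise space is an $\ell_2$ ball, uses a hypothetical toy model $f(x) = \sum_i \tanh(w_i x_i)$ with plain gradients as the vanilla attribution, and estimates the smoothed attribution by Monte Carlo averaging. All function names, the radius, and the sample count are illustrative choices. It only measures the cosine similarity empirically; it does not compute the certified lower bound.

```python
import math
import random


def sample_uniform_l2_ball(rng, dim, radius):
    """Draw one point uniformly from the l2 ball (assumed noise space)."""
    # A normalized Gaussian vector gives a uniform direction on the sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in g))
    # Scaling the radius by U^(1/dim) makes the point uniform in volume.
    scale = radius * rng.random() ** (1.0 / dim) / norm
    return [v * scale for v in g]


def gradient_attribution(w, x):
    """Vanilla attribution: gradient of the toy model f(x) = sum_i tanh(w_i * x_i)."""
    return [wi * (1.0 - math.tanh(wi * xi) ** 2) for wi, xi in zip(w, x)]


def smoothed_attribution(w, x, radius, n_samples, rng):
    """Monte Carlo average of attributions at uniformly perturbed inputs."""
    dim = len(x)
    total = [0.0] * dim
    for _ in range(n_samples):
        noise = sample_uniform_l2_ball(rng, dim, radius)
        grad = gradient_attribution(w, [xi + ni for xi, ni in zip(x, noise)])
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]


def cosine_similarity(a, b):
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_a * norm_b)


# Hypothetical example values, chosen only for illustration.
rng = random.Random(0)
w = [0.5, -1.0, 2.0, 0.3]
x = [1.0, 0.2, -0.4, 0.7]
x_perturbed = [xi + 0.05 for xi in x]

attr_clean = smoothed_attribution(w, x, radius=0.5, n_samples=2000, rng=rng)
attr_pert = smoothed_attribution(w, x_perturbed, radius=0.5, n_samples=2000, rng=rng)
similarity = cosine_similarity(attr_clean, attr_pert)
print(similarity)
```

The printed value is an empirical similarity for one perturbation. The certification in the paper is a statement about all perturbations in the attack region, which a sampled check like this cannot establish.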

Taxonomy

Topics: Multi-Criteria Decision Making · Risk and Portfolio Optimization · Bayesian Modeling and Causal Inference