AIM: Adversarial Information Masking for Faithfulness Evaluation of Saliency Maps
Chia-Ying Hsieh, Hsin-Yuan Fang, Chun-Shu Wei

TL;DR
AIM is an adversarial feature replacement framework for evaluating the faithfulness of saliency maps. It targets the bias that conventional masking operators (zero and interpolation masking) introduce into such evaluations, and is tested across multiple modalities.
Contribution
The paper proposes a novel adversarial masking approach that improves the reliability of saliency map evaluations by reducing masking-induced bias.
Findings
AIM reduces bias compared to zero and interpolation masking methods.
The method reveals modality-dependent differences in attribution methods.
Experiments across image, audio, and EEG data validate AIM's effectiveness.
Abstract
Post-hoc saliency methods are widely used to interpret deep neural networks, but their faithfulness is difficult to evaluate reliably. Existing evaluations mask features according to saliency-induced feature ordering and measure performance degradation, but this degradation can be confounded by the masking operator: zero masking may create out-of-distribution artifacts, while interpolation-based masking may preserve residual predictive information. We propose Adversarial Information Masking (AIM), a saliency-guided adversarial feature replacement framework for evaluating both saliency-map faithfulness and masking-operator reliability. AIM replaces selected features with values from an adversarial counterpart of the input and compares degradation under complementary masking orders. We assess reliability using random-attribution bias and stability of explanation-method faithfulness…
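To make the evaluation loop in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not say how the adversarial counterpart is built or how degradation is aggregated, so the function names (mask_with_replacement, degradation_gap), the FGSM-style step used to build x_adv for a toy linear model, and the "most-salient-first minus least-salient-first" gap are all assumptions made for illustration.

```python
import numpy as np


def mask_with_replacement(x, replacement, saliency, fraction, most_salient_first=True):
    """Replace a fraction of the features of x with values from `replacement`.

    Features are chosen in saliency order. `replacement` stands in for the
    masking operator: an adversarial counterpart of x (the AIM idea), or
    np.zeros_like(x) for plain zero masking.
    """
    order = np.argsort(saliency.ravel(), kind="stable")  # ascending saliency
    if most_salient_first:
        order = order[::-1]
    k = int(round(fraction * x.size))
    masked = x.ravel().copy()
    masked[order[:k]] = replacement.ravel()[order[:k]]
    return masked.reshape(x.shape)


def degradation_gap(model_fn, x, replacement, saliency, fractions):
    """Mean difference in score drop between the two complementary orders.

    Hypothetical summary metric (an assumption, not taken from the paper):
    a faithful saliency map should hurt the model more when its top-ranked
    features are replaced first, giving a positive gap.
    """
    base = model_fn(x)
    gaps = []
    for f in fractions:
        drop_most = base - model_fn(
            mask_with_replacement(x, replacement, saliency, f, most_salient_first=True)
        )
        drop_least = base - model_fn(
            mask_with_replacement(x, replacement, saliency, f, most_salient_first=False)
        )
        gaps.append(drop_most - drop_least)
    return float(np.mean(gaps))


# Toy setup: a linear "model" whose score is w . x.
w = np.array([3.0, -2.0, 0.5, 0.1])
x = np.ones(4)


def model_fn(inp):
    return float(w @ inp)


# Assumed adversarial counterpart: one FGSM-style step that lowers the score.
# For a linear model the gradient of the score with respect to x is w.
x_adv = x - 0.5 * np.sign(w)

fractions = [0.25, 0.5, 0.75]
faithful_gap = degradation_gap(model_fn, x, x_adv, np.abs(w), fractions)
inverted_gap = degradation_gap(model_fn, x, x_adv, -np.abs(w), fractions)
print(faithful_gap, inverted_gap)
```

In this toy case the saliency map |w| ranks features exactly by their influence on the score, so the gap is positive, and negating the map flips its sign. Passing np.zeros_like(x) as the replacement turns the same loop into zero masking, which is one way to compare masking operators under an identical feature ordering.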
