Towards Improved Input Masking for Convolutional Neural Networks
Sriram Balasubramanian, Soheil Feizi

TL;DR
This paper introduces a layer masking technique for CNNs that reduces missingness bias in input masking, improving interpretability and robustness of model explanations by minimizing the influence of mask shape and color.
Contribution
The authors propose layer masking, a novel method that applies masks to intermediate activations in CNNs and mitigates missingness bias better than traditional input masking approaches.
Findings
Layer masking minimizes the influence of mask shape and color on model output.
It outperforms input replacement methods like black or grey masks in interpretability tasks.
Data augmentation alone is insufficient to prevent reliance on mask shape.
Abstract
The ability to remove features from the input of machine learning models is very important to understand and interpret model predictions. However, this is non-trivial for vision models since masking out parts of the input image typically causes large distribution shifts. This is because the baseline color used for masking (typically grey or black) is out of distribution. Furthermore, the shape of the mask itself can contain unwanted signals which can be used by the model for its predictions. Recently, there has been some progress in mitigating this issue (called missingness bias) in image masking for vision transformers. In this work, we propose a new masking method for CNNs we call layer masking in which the missingness bias caused by masking is reduced to a large extent. Intuitively, layer masking applies a mask to intermediate activation maps so that the model only processes the…
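To make the mask-color problem concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation: the function names (`fill_masked`, `conv3x3`), the single-channel 4x4 image, and the averaging kernel are all assumptions chosen for illustration. It compares filling masked pixels with a baseline color before a plain convolution against a mask-aware convolution that simply skips masked pixels.

```python
# Toy illustration only; all names and values here are hypothetical.

def fill_masked(image, mask, baseline):
    """Input masking: overwrite masked pixels (mask == 0) with a baseline value."""
    return [[px if m else baseline for px, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]

def conv3x3(image, kernel, mask=None):
    """3x3 'same' cross-correlation with zero padding on a 2D list.

    If mask is given (1 = keep, 0 = masked), masked pixels are skipped,
    so their values never contribute to any output location.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if mask is not None and mask[i][j] == 0:
                continue  # output at a masked location stays 0.0
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if not (0 <= y < h and 0 <= x < w):
                        continue  # zero padding outside the image
                    if mask is not None and mask[y][x] == 0:
                        continue  # ignore masked neighbours
                    acc += kernel[di + 1][dj + 1] * image[y][x]
            out[i][j] = acc
    return out

image = [[0.1, 0.2, 0.3, 0.4],
         [0.5, 0.6, 0.7, 0.8],
         [0.9, 1.0, 0.9, 0.8],
         [0.7, 0.6, 0.5, 0.4]]
mask = [[1, 1, 1, 1],
        [1, 0, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1]]
kernel = [[1 / 9] * 3 for _ in range(3)]  # simple averaging filter

# Input masking: the result changes with the baseline color.
black = conv3x3(fill_masked(image, mask, 0.0), kernel)
grey = conv3x3(fill_masked(image, mask, 0.5), kernel)

# Mask-aware convolution: whatever sits under the mask is never read.
aware_black = conv3x3(fill_masked(image, mask, 0.0), kernel, mask)
aware_grey = conv3x3(fill_masked(image, mask, 0.5), kernel, mask)

input_masking_depends_on_color = black != grey
mask_aware_depends_on_color = aware_black != aware_grey
print(input_masking_depends_on_color, mask_aware_depends_on_color)
```

In this toy, the baseline color leaks into the plain convolution's output, while the mask-aware variant gives the same result for any fill value. It covers a single layer and does nothing about mask shape, so it should be read as an intuition aid rather than a reproduction of layer masking as defined in the paper.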
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning
Methods: Local Interpretable Model-Agnostic Explanations
