Interpreting Low-level Vision Models with Causal Effect Maps
Jinfan Hu, Jinjin Gu, Shiyao Yu, Fanghua Yu, Zheyuan Li, Zhiyuan You, Chaochao Lu, Chao Dong

TL;DR
This paper introduces Causal Effect Maps (CEM), a causality-based interpretability method for low-level vision models. CEM visualizes and quantifies the positive and negative effects of the input on the output, and its results challenge the assumptions that more input information and global receptive-field mechanisms always help.
Contribution
The paper proposes a novel, model- and task-agnostic causality-based interpretability method called CEM for low-level vision models, providing new insights into their behavior.
Findings
More input information does not always improve performance.
Global receptive field mechanisms may be ineffective in denoising.
Multi-task training encourages local over global information use.
Abstract
Deep neural networks have significantly improved the performance of low-level vision tasks but also increased the difficulty of interpretability. A deep understanding of deep models is beneficial for both network design and practical reliability. To take up this challenge, we introduce causality theory to interpret low-level vision models and propose a model-/task-agnostic method called Causal Effect Map (CEM). With CEM, we can visualize and quantify the input-output relationships on either positive or negative effects. After analyzing various low-level vision tasks with CEM, we have reached several interesting insights, such as: (1) Using more information of input images (e.g., larger receptive field) does NOT always yield positive outcomes. (2) Attempting to incorporate mechanisms with a global receptive field (e.g., channel attention) into image denoising may prove futile. (3)…
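The abstract does not say how CEM is computed, so the sketch below is not the authors' algorithm. It illustrates only the general idea of an intervention-style effect map: replace one input patch at a time, re-run the model, and record how the error inside an output region changes. A positive entry means the patch was helping (error rises without it); a negative entry means it was hurting. Every name here (`box_blur`, `effect_map`, the mean-fill intervention, the patch size) is a hypothetical choice made for illustration, and a 3x3 mean filter stands in for a real restoration network.

```python
import numpy as np

def box_blur(img, k=3):
    """Toy stand-in for a restoration model: k x k mean filter, edge-padded."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def effect_map(model, x, target, roi, patch=4):
    """Per-patch change in ROI error after intervening on that patch.

    roi is (y0, y1, x0, x1) in output coordinates. The intervention
    (assumed here) replaces the patch with the input's global mean.
    """
    y0, y1, x0, x1 = roi

    def roi_error(inp):
        out = model(inp)
        return float(np.mean((out[y0:y1, x0:x1] - target[y0:y1, x0:x1]) ** 2))

    base_err = roi_error(x)
    fill = x.mean()
    rows, cols = x.shape[0] // patch, x.shape[1] // patch
    emap = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            x_int = x.copy()
            x_int[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = fill
            # > 0: patch had a positive effect; < 0: negative effect.
            emap[i, j] = roi_error(x_int) - base_err
    return emap

# Synthetic denoising setup (made-up data, for illustration only).
rng = np.random.default_rng(0)
clean = rng.random((16, 16))
noisy = clean + 0.1 * rng.standard_normal((16, 16))
emap = effect_map(box_blur, noisy, clean, roi=(6, 10, 6, 10), patch=4)
print(np.round(emap, 5))
```

With this toy model, patches outside the filter's 3x3 reach of the region get an effect of exactly zero, which is the kind of locality such a map is meant to expose. For a real network, `model` would wrap a forward pass, and the choice of intervention (mean fill, blur, resampling) would itself need justification.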
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
