Going Beyond Saliency Maps: Training Deep Models to Interpret Deep Models
Zixuan Liu, Ehsan Adeli, Kilian M. Pohl, Qingyu Zhao

TL;DR
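This paper introduces an interpretability method for deep learning classifiers in neuroimaging. Instead of relying only on saliency maps, it trains simulator networks that warp a brain image to inject or remove disease-related patterns, so the visualization shows what kind of alteration the classifier associates with the disease.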
Contribution
It proposes a new approach employing simulator networks and image warping to generate human-understandable visualizations of disease effects in brain images.
Findings
Simulations reveal meaningful disease-related patterns.
The method outperforms saliency maps in interpretability.
It is effective on both synthetic and real neuroimaging data.
Abstract
Interpretability is a critical factor in applying complex deep learning models to advance the understanding of brain disorders in neuroimaging studies. To interpret the decision process of a trained classifier, existing techniques typically rely on saliency maps to quantify the voxel-wise or feature-level importance for classification through partial derivatives. Despite providing some level of localization, these maps are not human-understandable from the neuroscience perspective as they do not inform the specific meaning of the alteration linked to the brain disorder. Inspired by the image-to-image translation scheme, we propose to train simulator networks that can warp a given image to inject or remove patterns of the disease. These networks are trained such that the classifier produces consistently increased or decreased prediction logits for the simulated images. Moreover, we…
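The contrast the abstract draws can be sketched on a toy 1D "image". This is a minimal illustration, not the authors' implementation: the classifier is a fixed hand-written function rather than a trained network, the simulator is a hand-crafted displacement field rather than a trained simulator network, and every name below (`classifier_logit`, `saliency`, `sample`, `simulate`) is hypothetical. It shows a saliency map computed from partial derivatives, then a warp in each direction, followed by the check the abstract describes: the logit should rise for the injected image and fall for the removed one.

```python
# Toy 1D illustration. All names are hypothetical and not taken from the paper.

def classifier_logit(image, center=10, radius=5, bias=-5.0):
    """Stand-in for a trained classifier: total intensity near the center plus a bias."""
    return sum(v for i, v in enumerate(image) if abs(i - center) <= radius) + bias

def saliency(image, eps=1e-3):
    """Finite-difference estimate of the partial derivative of the logit per voxel."""
    base = classifier_logit(image)
    grads = []
    for i in range(len(image)):
        bumped = list(image)
        bumped[i] += eps
        grads.append((classifier_logit(bumped) - base) / eps)
    return grads

def sample(image, pos):
    """Linearly interpolate the image at a fractional position, clamped to its bounds."""
    pos = min(max(pos, 0.0), len(image) - 1.0)
    lo = int(pos)
    hi = min(lo + 1, len(image) - 1)
    frac = pos - lo
    return (1.0 - frac) * image[lo] + frac * image[hi]

def simulate(image, direction, strength=0.3, center=10):
    """Hand-crafted stand-in for a simulator network (assumption: no training here).

    direction=+1 samples closer to the center, which enlarges the central structure;
    direction=-1 samples farther from the center, which shrinks it.
    """
    field = [direction * strength * (center - i) for i in range(len(image))]
    warped = [sample(image, i + u) for i, u in enumerate(field)]
    return warped, field

# A bright structure of width 5 in the middle of a 21-voxel image.
image = [1.0 if abs(i - 10) <= 2 else 0.0 for i in range(21)]

sal = saliency(image)                        # says where the classifier is sensitive
injected, field_up = simulate(image, +1)     # "inject" the pattern
removed, field_down = simulate(image, -1)    # "remove" the pattern

logit_orig = classifier_logit(image)
logit_up = classifier_logit(injected)
logit_down = classifier_logit(removed)
print(logit_down, logit_orig, logit_up)
```

In this toy setting the saliency values are identical across the whole sensitive window, so they locate the region without saying what changes inside it. The displacement field and the warped image make the alteration explicit: the central structure grows or shrinks.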
Taxonomy
Topics: Cell Image Analysis Techniques · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare
