Interpreting and improving deep-learning models with reality checks
Chandan Singh, Wooseok Ha, and Bin Yu

TL;DR
This paper reviews recent methods for interpreting deep-learning models through feature importance and interactions, emphasizing the use of reality checks to validate these interpretations and improve model generalization across various domains.
Contribution
It introduces a framework for attribution methods that include feature interactions and demonstrates their application in real-world domains, with a focus on validation via reality checks.
Findings
Attributions reveal meaningful feature interactions in diverse applications
Reality checks effectively validate interpretation techniques
Interpreting models can lead to improved generalization and simpler models
Abstract
Recent deep-learning models have achieved impressive predictive performance by learning complex functions of many variables, often at the cost of interpretability. This chapter covers recent work aiming to interpret models by attributing importance to features and feature groups for a single prediction. Importantly, the proposed attributions assign importance to interactions between features, in addition to features in isolation. These attributions are shown to yield insights across real-world domains, including bio-imaging, cosmology imaging, and natural-language processing. We then show how these attributions can be used to directly improve the generalization of a neural network or to distill it into a simple model. Throughout the chapter, we emphasize the use of reality checks to scrutinize the proposed interpretation techniques.
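To make the idea of an interaction attribution and a reality check concrete, here is a minimal sketch. It does not reproduce the chapter's attribution methods. It swaps in a generic occlusion-based second-order difference, and the names `occlusion_interaction`, `toy_model`, and the zero `baseline` are hypothetical choices made for illustration. The reality check is that the toy model has a known ground truth (only features 0 and 1 interact), so a sensible interaction score should be nonzero for that pair and zero for every other pair.

```python
from itertools import combinations


def occlusion_interaction(f, x, i, j, baseline=0.0):
    """Interaction score for features i and j of a single input x.

    Generic occlusion-based second-order difference (an illustrative
    stand-in, not the chapter's method):
        f(x) - f(x without i) - f(x without j) + f(x without i and j)
    where "without" means the feature is replaced by `baseline`.
    """
    def masked(indices):
        z = list(x)
        for k in indices:
            z[k] = baseline
        return f(z)

    return masked(()) - masked((i,)) - masked((j,)) + masked((i, j))


def toy_model(x):
    # Hypothetical model with a known ground truth:
    # features 0 and 1 interact, feature 2 acts additively.
    return 2.0 * x[0] * x[1] + 3.0 * x[2]


x = [1.0, 2.0, 5.0]
scores = {
    (i, j): occlusion_interaction(toy_model, x, i, j)
    for i, j in combinations(range(len(x)), 2)
}

# Reality check: only the pair (0, 1) should receive a nonzero score.
for pair, score in scores.items():
    print(pair, score)
```

Because additive terms cancel in the second-order difference, the pairs involving feature 2 score zero here, which matches how the toy model was built. A check of this kind on simulated data only shows that a score behaves sensibly in a controlled setting; it does not validate the score on a real network.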
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Radiomics and Machine Learning in Medical Imaging · Machine Learning in Healthcare
