TL;DR
This paper explores the semantic interpretation of thoracic disease diagnosis models, analyzing internal representations and proposing semantic attribution methods to better understand model behavior on chest X-ray data.
Contribution
It introduces a method to semantically interpret CNNs for thoracic disease diagnosis and investigates how training data and pretraining affect interpretability.
Findings
Internal units correlate with specific thoracic pathologies
Pretraining improves interpretability of learned features
Weakly trained models implicitly learn pathology patterns
Abstract
Convolutional neural networks are showing promise in the automatic diagnosis of thoracic pathologies on chest X-rays. Their black-box nature has prompted many recent works that explain predictions via input feature attribution methods (also known as saliency methods). However, input feature attribution methods merely identify the importance of input regions for the prediction and lack a semantic interpretation of model behavior. In this work, we first identify the semantics associated with internal units (feature maps) of the network. We then investigate the following questions: Does a regression model that is trained only with COVID-19 severity scores implicitly learn visual patterns associated with thoracic pathologies? Does a network that is trained on weakly labeled data (e.g. healthy, unhealthy) implicitly learn pathologies? Moreover, we investigate the effect of pretraining and data…
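To make "semantics associated with internal units" concrete, here is a minimal sketch of one common way to score how well a single feature map matches an annotated concept: threshold the activation and compute its intersection-over-union (IoU) with each pathology mask. This is an illustrative assumption, not necessarily the procedure used in the paper. The function names, the threshold value, and the toy masks are all hypothetical, and the activation is assumed to be already resized to the mask resolution.

```python
from typing import Dict, List, Sequence, Tuple

Grid = Sequence[Sequence[float]]
Mask = Sequence[Sequence[bool]]


def binarize(activation: Grid, threshold: float) -> List[List[bool]]:
    """Mark the locations where the unit's activation exceeds the threshold."""
    return [[value > threshold for value in row] for row in activation]


def iou(unit_mask: Mask, concept_mask: Mask) -> float:
    """Intersection-over-union of two boolean masks with the same shape."""
    intersection = 0
    union = 0
    for unit_row, concept_row in zip(unit_mask, concept_mask):
        for u, c in zip(unit_row, concept_row):
            intersection += int(u and c)
            union += int(u or c)
    return intersection / union if union else 0.0


def best_concept(
    activation: Grid, concept_masks: Dict[str, Mask], threshold: float
) -> Tuple[str, float]:
    """Return the concept whose mask overlaps most with the thresholded unit."""
    unit_mask = binarize(activation, threshold)
    scores = {name: iou(unit_mask, mask) for name, mask in concept_masks.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy 4x4 example (hypothetical values, not data from the paper).
T, F = True, False
activation = [
    [0.1, 0.2, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
concept_masks = {
    "consolidation": [[F, F, F, F], [F, T, T, F], [F, T, T, T], [F, F, F, F]],
    "effusion": [[F, F, F, F], [F, F, F, F], [F, F, F, F], [T, T, T, T]],
}

label, score = best_concept(activation, concept_masks, threshold=0.5)
print(label, score)
```

In practice a score like this would be aggregated over many images before a unit is assigned a label, and the threshold would be chosen per unit rather than fixed by hand.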
