A Theory of Diagnostic Interpretation in Supervised Classification
Anirban Mukhopadhyay

TL;DR
This paper presents a formal computational framework for interpretability in supervised classification, defining interpretability as maximal information gain through finite communication, and provides an algorithm to quantify diagnostic interpretability.
Contribution
It introduces a novel theoretical model for diagnostic interpretability that removes subjective bias and does not rely on accuracy-interpretability tradeoffs, with an algorithm to compute interpretability.
Findings
The algorithm successfully calculates diagnostic interpretability in synthetic scenarios.
Interpretability is framed as maximal information gain within finite communication.
The model demonstrates effectiveness across various complexity levels in simulations.
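The framing of interpretability as information gain within finite communication can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a hypothetical setting in which the known model holds a discrete belief over candidate decision hypotheses for the black box, and each round of communication yields an answer whose likelihood under each hypothesis is known. The names `entropy`, `bayes_update`, and `information_gain` are illustrative only.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def bayes_update(prior, likelihoods):
    """Posterior over hypotheses after observing one black-box answer.

    likelihoods[i] is the probability of the observed answer under
    hypothesis i (an assumption of this sketch, not of the paper).
    """
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def information_gain(prior, answers):
    """Reduction in uncertainty after a finite sequence of answers."""
    belief = list(prior)
    for likelihoods in answers:
        belief = bayes_update(belief, likelihoods)
    return entropy(prior) - entropy(belief)


# Four equally likely hypotheses about the black box's decision (2 bits).
prior = [0.25, 0.25, 0.25, 0.25]

# Two rounds of communication; each answer rules out half the hypotheses.
answers = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
]

gain = information_gain(prior, answers)
print(f"information gain: {gain:.1f} bits")
```

In this toy case two perfectly informative answers remove all of the initial 2 bits of uncertainty. With a smaller communication budget or noisier answers the gain would be lower, which is the quantity the definition asks to maximize.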
Abstract
Interpretable deep learning is a fundamental building block towards safer AI, especially when the deployment of deep learning-based computer-aided medical diagnostic systems is so imminent. However, without a computational formulation of black-box interpretation, general interpretability research relies heavily on subjective bias. The clear decision structure of medical diagnostics lets us approximate the decision process of a radiologist as a model, removed from subjective bias. We define the process of interpretation as a finite communication between a known model and a black-box model that optimally maps the black box's decision process onto the known model. Consequently, we define interpretability as the maximal information gain over the initial uncertainty about the black box's decision within finite communication. We relax this definition based on the observation that…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Topic Modeling
Methods: Interpretability
