Model Explanations under Calibration
Rishabh Jain, Pranava Madhyastha

TL;DR
This paper investigates how attention-based interpretability in recommender systems is affected by model calibration. It finds that attention distributions are unstable for uncalibrated models and questions the utility of attention as an explanation.
Contribution
It provides an empirical analysis of the stability of attention-based explanations under different calibration states of models.
Findings
Attention distributions are unstable for uncalibrated models.
Calibration affects the reliability of attention as an explanation.
Attention may not be a robust explanation method for uncalibrated models.
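One way to make "unstable attention" concrete is to compare the attention distributions that two versions of a model assign to the same input. The sketch below uses Jensen-Shannon divergence for that comparison. This is an assumption made for illustration: the summary does not state which stability measure the paper uses, and the function names and example weights are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two attention distributions.

    Both inputs are assumed to be probability vectors over the same
    positions (non-negative, summing to 1). The result lies in [0, ln 2].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical attention weights over four input positions from two
# versions of a model (for example, before and after calibration).
attention_a = [0.70, 0.10, 0.10, 0.10]
attention_b = [0.10, 0.10, 0.10, 0.70]

shift = js_divergence(attention_a, attention_b)
```

A value near 0 means the two versions attend to the same positions, while a value near ln 2 (about 0.693) means their attention barely overlaps.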
Abstract
Explaining and interpreting the decisions of recommender systems is becoming extremely relevant, both for improving predictive performance and for providing valid explanations to users. While most of the recent interest has focused on providing local explanations, there has been much less emphasis on studying the effects of model dynamics and their impact on explanation. In this paper, we perform a focused study of model interpretability in the context of calibration. Specifically, we address the challenges of both over-confident and under-confident predictions with interpretability using attention distributions. Our results indicate that the means of using attention distributions for interpretability are highly unstable for uncalibrated models. Our empirical analysis of the stability of attention distributions raises questions about the utility of attention for explainability.
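The abstract's distinction between over-confident and under-confident predictions can be illustrated with a standard calibration diagnostic. The sketch below computes expected calibration error (ECE) with equal-width confidence bins, plus a signed confidence gap. It is a minimal illustration under assumed inputs, not the paper's evaluation code; the function names, bin count, and toy data are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |mean confidence - accuracy| over confidence bins.

    confidences: predicted probability of the chosen label, each in [0, 1].
    correct: truthy where the prediction was right, falsy otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("inputs must have the same length")
    if not confidences:
        raise ValueError("inputs must be non-empty")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


def confidence_gap(confidences, correct):
    """Mean confidence minus accuracy.

    Positive values suggest over-confidence, negative values under-confidence.
    """
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(1 for h in correct if h) / len(correct)
    return mean_conf - accuracy


# Toy data: a model that claims 90% confidence but is right half the time.
over_conf = [0.9, 0.9, 0.9, 0.9]
over_hits = [1, 1, 0, 0]
# Toy data: a model that claims 60% confidence but is always right.
under_conf = [0.6, 0.6, 0.6, 0.6]
under_hits = [1, 1, 1, 1]
```

ECE alone does not reveal the direction of miscalibration, which is why the signed gap is included: both toy models have the same ECE but gaps of opposite sign.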
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques · Machine Learning in Healthcare
Methods: Interpretability
