Robust Explanations Through Uncertainty Decomposition: A Path to Trustworthier AI
Chenrui Zhu, Louenas Bounia, Vu Linh Nguyen, Sébastien Destercke, Arthur Hoarau

TL;DR
This paper introduces an uncertainty-aware framework for explainable AI that separates data-related (aleatoric) from model-related (epistemic) uncertainty to make explanations of machine learning models more robust and reliable.
Contribution
It proposes a novel method that uses aleatoric and epistemic uncertainty to guide explanation selection and assess explanation reliability, enhancing interpretability.
Findings
Uncertainty-guided explanations improve robustness.
Epistemic uncertainty indicates insufficient training.
Framework applies to both traditional and deep learning models.
Abstract
Recent advancements in machine learning have emphasized the need for transparency in model predictions, particularly as interpretability diminishes when using increasingly complex architectures. In this paper, we propose leveraging prediction uncertainty as a complementary approach to classical explainability methods. Specifically, we distinguish between aleatoric (data-related) and epistemic (model-related) uncertainty to guide the selection of appropriate explanations. Epistemic uncertainty serves as a rejection criterion for unreliable explanations and, in itself, provides insight into insufficient training (a new form of explanation). Aleatoric uncertainty informs the choice between feature-importance explanations and counterfactual explanations. This leverages a framework of explainability methods driven by uncertainty quantification and disentanglement. Our experiments demonstrate…
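As a rough illustration of the idea in the abstract, the sketch below separates the two kinds of uncertainty and uses them to pick an explanation type. It is not the authors' implementation. It assumes an ensemble of classifiers and the common entropy-based decomposition (total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the average entropy of the members, and epistemic uncertainty is the difference). The function names, the threshold values, and the choice to map high aleatoric uncertainty to counterfactual explanations are all hypothetical; the abstract only states that aleatoric uncertainty informs that choice, not in which direction.

```python
import numpy as np


def entropy(p, axis=-1):
    """Shannon entropy in nats; probabilities are clipped so log(0) never occurs."""
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=axis)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for ONE input into (total, aleatoric, epistemic).

    member_probs: array of shape (n_members, n_classes), one row of class
    probabilities per ensemble member (an assumed setup, not from the paper).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    total = float(entropy(member_probs.mean(axis=0)))   # entropy of the mean prediction
    aleatoric = float(entropy(member_probs).mean())     # mean entropy of the members
    epistemic = max(total - aleatoric, 0.0)             # disagreement between members
    return total, aleatoric, epistemic


def choose_explanation(member_probs, epistemic_threshold=0.1, aleatoric_threshold=0.3):
    """Hypothetical routing rule; both thresholds are illustrative values only."""
    _, aleatoric, epistemic = decompose_uncertainty(member_probs)
    if epistemic > epistemic_threshold:
        # The model itself is unsure: treat any explanation as unreliable.
        return "reject"
    if aleatoric > aleatoric_threshold:
        # Assumption: ambiguous data is better served by a counterfactual.
        return "counterfactual"
    return "feature_importance"


if __name__ == "__main__":
    confident = [[0.99, 0.01], [0.99, 0.01]]   # members agree and are sure
    ambiguous = [[0.50, 0.50], [0.50, 0.50]]   # members agree the input is ambiguous
    disagree = [[0.99, 0.01], [0.01, 0.99]]    # members contradict each other
    for name, probs in [("confident", confident), ("ambiguous", ambiguous), ("disagree", disagree)]:
        print(name, choose_explanation(probs))
```

With these toy inputs, the case where members contradict each other is rejected, the case where they agree on an ambiguous prediction is routed to a counterfactual, and the confident case is routed to feature importance. In practice the thresholds would need to be calibrated on held-out data.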
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
