Can I Trust the Explanations? Investigating Explainable Machine Learning Methods for Monotonic Models
Dangxing Chen

TL;DR
This paper investigates the reliability of explainable machine learning methods when applied to science-informed monotonic models, revealing that different explanation techniques perform variably depending on the type of monotonicity involved.
Contribution
It proposes three axioms for monotonicity and evaluates how well explanation methods perform under each type of monotonicity.
Findings
The baseline Shapley value provides good explanations when only individual monotonicity is involved.
Integrated gradients are effective for models with strong pairwise monotonicity.
Explanation method performance varies with the type of monotonicity in the model.
Abstract
In recent years, explainable machine learning methods have been very successful. Despite their success, most explainable machine learning methods are applied to black-box models without any domain knowledge. By incorporating domain knowledge, science-informed machine learning models have demonstrated better generalization and interpretation. But do we obtain consistent scientific explanations if we apply explainable machine learning methods to science-informed machine learning models? This question is addressed in the context of monotonic models that exhibit three different types of monotonicity. To demonstrate monotonicity, we propose three axioms. Accordingly, this study shows that when only individual monotonicity is involved, the baseline Shapley value provides good explanations; however, when strong pairwise monotonicity is involved, the Integrated gradients method provides…
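To make the two attribution methods named in the abstract concrete, here is a minimal sketch of the baseline Shapley value and integrated gradients on a toy monotonic function. It does not reproduce the paper's models, axioms, or experiments. The function `model`, the input `x`, the all-zeros baseline, and the helper names `baseline_shapley` and `integrated_gradients` are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical toy model (an assumption, not from the paper):
    # non-decreasing in both features whenever x[0], x[1] >= 0.
    return x[0] + 2.0 * x[1] + x[0] * x[1]


def baseline_shapley(f, x, baseline):
    """Exact baseline Shapley values: features outside a coalition
    are set to their baseline value instead of being marginalized."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


def integrated_gradients(f, x, baseline, steps=200, eps=1e-5):
    """Integrated gradients along the straight path from baseline to x,
    using a midpoint Riemann sum and central-difference gradients."""
    n = len(x)
    attributions = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            up, down = list(point), list(point)
            up[i] += eps
            down[i] -= eps
            grad_i = (f(up) - f(down)) / (2.0 * eps)
            attributions[i] += (x[i] - baseline[i]) * grad_i / steps
    return attributions


x = [1.0, 2.0]
baseline = [0.0, 0.0]
print("baseline Shapley:    ", baseline_shapley(model, x, baseline))
print("integrated gradients:", integrated_gradients(model, x, baseline))
```

Both methods satisfy completeness, so each set of attributions sums to `model(x) - model(baseline)`, which is 7 here. On this simple function the two methods happen to agree. The paper's point is that on models with different types of monotonicity they can differ in how well they explain the model.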
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare · Machine Learning and Data Classification
