cc-Shapley: Measuring Multivariate Feature Importance Needs Causal Context
Jörg Martin, Stefan Haufe

TL;DR
This paper introduces cc-Shapley, a causal-context-aware modification of Shapley values, to accurately measure feature importance by incorporating causal knowledge, thus avoiding misleading associations caused by collider bias.
Contribution
It proposes a new causal-contextual approach to feature importance that corrects for biases inherent in purely data-driven methods like traditional Shapley values.
Findings
cc-Shapley removes spurious associations caused by collider bias.
Compared with conventional Shapley values, cc-Shapley often nullifies or reverses the importance attributed to a feature.
Theoretical analysis confirms cc-Shapley's effectiveness in causal contexts.
Abstract
Explainable artificial intelligence promises to yield insights into relevant features, thereby enabling humans to examine and scrutinize machine learning models or even facilitating scientific discovery. Considering the widespread technique of Shapley values, we find that purely data-driven operationalization of multivariate feature importance is unsuitable for such purposes. Even for simple problems with two features, spurious associations due to collider bias and suppression arise from considering one feature only in the observational context of the other, which can lead to misinterpretations. Causal knowledge about the data-generating process is required to identify and correct such misleading feature attributions. We propose cc-Shapley (causal context Shapley), an interventional modification of conventional observational Shapley values leveraging knowledge of the data's causal…
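The two-feature collider effect described in the abstract can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a hypothetical linear-Gaussian setup in which X1 is independent of the target Y and X2 = X1 + Y + noise is a collider. It computes only the conventional observational Shapley value of X1 and does not implement cc-Shapley. Because all variables are jointly Gaussian, the conditional expectations are linear, so ordinary least squares is used to estimate them.

```python
import numpy as np

# Hypothetical data-generating process (an assumption for illustration):
# X1 and Y are independent; X2 is a collider with X1 -> X2 <- Y.
rng = np.random.default_rng(0)
n = 200_000
y = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = x1 + y + 0.5 * rng.normal(size=n)


def fit_linear(features, target):
    """Least-squares fit of target on an intercept plus the given features."""
    design = np.column_stack([np.ones(len(target))] + list(features))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict(coef, values):
    """Evaluate a fitted linear model at a single point."""
    return float(coef[0] + np.dot(coef[1:], values))


# Observational value functions v(S) = E[Y | X_S = x_S], estimated linearly.
coef_1 = fit_linear([x1], y)        # E[Y | X1]
coef_2 = fit_linear([x2], y)        # E[Y | X2]
coef_12 = fit_linear([x1, x2], y)   # E[Y | X1, X2]
v_empty = float(y.mean())           # E[Y]


def observational_shapley_x1(a, b):
    """Two-feature Shapley value of X1 at the point (X1, X2) = (a, b)."""
    v1 = predict(coef_1, [a])
    v2 = predict(coef_2, [b])
    v12 = predict(coef_12, [a, b])
    return 0.5 * (v1 - v_empty) + 0.5 * (v12 - v2)


corr_x1_y = float(np.corrcoef(x1, y)[0, 1])
phi1 = observational_shapley_x1(1.0, 0.0)

print(f"corr(X1, Y)                 = {corr_x1_y:+.3f}")
print(f"coefficient of X1 given X2  = {coef_12[1]:+.3f}")
print(f"observational Shapley of X1 = {phi1:+.3f}")
```

In this setup X1 carries no information about Y on its own, yet it receives a clearly nonzero attribution once it is evaluated in the observational context of X2. That is the kind of spurious association the abstract attributes to collider bias.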
