How important are the genes to explain the outcome - the asymmetric Shapley value as an honest importance metric for high-dimensional features
Mark A. van de Wiel, Jeroen Goedhart, Martin Jullum, Kjersti Aas

TL;DR
This paper proposes asymmetric Shapley values as a more suitable way to quantify the importance of high-dimensional features, such as genomic data, in clinical prediction models that also include traditional clinical variables and confounders.
Contribution
The paper derives efficient algorithms for computing local and global asymmetric Shapley values in mixed-dimensional clinical prediction models, focusing on settings where disease state mediates genomic effects and the direction of confounder effects may be unknown.
Findings
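Asymmetric Shapley values account for collinearity and the known directionality of dependencies between variables, which the usual change in predictive performance from adding genomic features does not.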
Algorithms enable practical computation of importance scores in complex models.
Application to colorectal cancer data demonstrates improved interpretability.
Abstract
In clinical prediction settings, the importance of a high-dimensional feature like genomics is often assessed by evaluating the change in predictive performance when adding it to a set of traditional clinical variables. This approach is questionable, because it accounts for neither collinearity nor the known directionality of dependencies between variables. We suggest using asymmetric Shapley values as a more suitable alternative to quantify feature importance in the context of a mixed-dimensional prediction model. We focus on a setting that is particularly relevant in clinical prediction: disease state as a mediating variable for genomic effects, with additional confounders for which the direction of effects may be unknown. We derive efficient algorithms to compute local and global asymmetric Shapley values for this setting. The former are shown to be very useful for inference, whereas the…
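To make the idea in the abstract concrete, here is a minimal sketch of how an asymmetric Shapley value differs from the ordinary (symmetric) one. It is not the paper's efficient algorithm: it simply enumerates all feature orderings, which is only feasible for a handful of feature groups. The feature names (`genes`, `stage`, `age`), the `value` function, and its numbers are hypothetical and chosen only so that `genes` and `stage` are strongly collinear. The constraint that `genes` precedes `stage` stands in for the setting where disease state mediates genomic effects.

```python
from itertools import permutations


def shapley(players, value, precedes=()):
    """Average marginal contributions over orderings that respect `precedes`.

    `precedes` holds (a, b) pairs meaning a must enter the model before b.
    With no pairs, every ordering is allowed and the result is the
    ordinary symmetric Shapley value.
    """
    allowed = [
        p for p in permutations(players)
        if all(p.index(a) < p.index(b) for a, b in precedes)
    ]
    phi = {i: 0.0 for i in players}
    for order in allowed:
        seen = frozenset()
        for i in order:
            # Marginal gain of adding feature i to those already seen.
            phi[i] += value(seen | {i}) - value(seen)
            seen = seen | {i}
    return {i: total / len(allowed) for i, total in phi.items()}


# Hypothetical explained-variation numbers: genes and stage overlap heavily,
# age contributes independently.
_BLOCK = {
    frozenset(): 0.0,
    frozenset({"genes"}): 0.4,
    frozenset({"stage"}): 0.4,
    frozenset({"genes", "stage"}): 0.5,
}


def value(subset):
    subset = frozenset(subset)
    return _BLOCK[subset & {"genes", "stage"}] + (0.1 if "age" in subset else 0.0)


players = ["genes", "stage", "age"]
symmetric = shapley(players, value)
asymmetric = shapley(players, value, precedes=[("genes", "stage")])
print(symmetric)
print(asymmetric)
```

With these toy numbers, the symmetric version splits the shared signal evenly (about 0.25 each for `genes` and `stage`), while the asymmetric version credits `genes` with about 0.40 and `stage` with about 0.10, because `genes` is always evaluated first. In both cases the values sum to the value of the full model, 0.6. No ordering constraint is placed on `age`, mirroring a confounder whose direction of effect is unknown.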
Taxonomy
Topics: Statistical Methods and Inference · Genetic Associations and Epidemiology · Bayesian Modeling and Causal Inference
