Interpretable Models via Pairwise permutations algorithm
Troy Maasland, João Pereira, Diogo Bastos, Marcus de Goffau, Max Nieuwdorp, Aeilko H. Zwinderman, Evgeni Levin

TL;DR
This paper introduces the pairwise permutation algorithm (PPA), a new method designed to reduce correlation bias in feature importance assessments, improving interpretability in high-dimensional biological data analysis.
Contribution
The paper presents the theoretical foundation of PPA and demonstrates its effectiveness in correcting correlation bias through toy and microbiome datasets.
Findings
PPA corrects correlation effects in feature importance.
PPA identifies biologically relevant biomarkers.
PPA improves interpretability in high-dimensional data.
Abstract
One of the most common pitfalls in high-dimensional biological data sets is correlation between features. This may lead statistical and machine learning methods to overvalue or undervalue the correlated predictors, while the truly relevant ones are ignored. In this paper, we define a new method called the pairwise permutation algorithm (PPA), which aims to mitigate correlation bias in feature importance values. First, we provide a theoretical foundation that builds upon previous work on permutation importance. PPA is then applied to a toy data set, where we demonstrate its ability to correct the correlation effect. We further test PPA on a microbiome shotgun data set to show that it can already identify biologically relevant biomarkers.
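The correlation bias described in the abstract can be illustrated with ordinary permutation importance on a small synthetic example. The sketch below is not the paper's PPA, whose details are not given in the abstract. It only contrasts shuffling one feature at a time with shuffling a correlated pair together. The toy data, the fixed `predict` model (standing in for a fitted model that splits its weight across two nearly identical features), and all function names are assumptions made for illustration.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def predict(rows):
    # Hypothetical "fitted" model (an assumption, not from the paper):
    # it splits the weight of the true signal x1 across x1 and its
    # near-duplicate x2, as a regularised model might.
    return [0.5 * r[0] + 0.5 * r[1] + r[2] for r in rows]


def permutation_importance(rows, y, cols, rng):
    """Increase in MSE when the columns in `cols` are shuffled together.

    All listed columns receive the same row permutation, so the
    relationship between them is preserved while their link to y is broken.
    """
    baseline = mse(y, predict(rows))
    order = list(range(len(rows)))
    rng.shuffle(order)
    shuffled = [list(r) for r in rows]
    for i, j in enumerate(order):
        for c in cols:
            shuffled[i][c] = rows[j][c]
    return mse(y, predict(shuffled)) - baseline


# Toy data: x2 is a noisy copy of x1, x3 is independent, y = x1 + x3.
rng = random.Random(0)
rows, y = [], []
for _ in range(5000):
    x1 = rng.gauss(0, 1)
    x2 = x1 + rng.gauss(0, 0.1)
    x3 = rng.gauss(0, 1)
    rows.append((x1, x2, x3))
    y.append(x1 + x3)

# Importance of each feature shuffled alone, then of the correlated pair.
single = {c: permutation_importance(rows, y, [c], rng) for c in range(3)}
joint_12 = permutation_importance(rows, y, [0, 1], rng)

for c, value in single.items():
    print(f"x{c + 1} alone: {value:.3f}")
print(f"x1 and x2 together: {joint_12:.3f}")
```

In this setup the two correlated features each look much less important than the independent feature x3 when shuffled alone, even though the signal they carry contributes as much to y as x3 does. Shuffling the pair together recovers an importance comparable to that of x3. This is the kind of undervaluation the abstract refers to.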
Taxonomy
Topics: Gene expression and cancer classification · Machine Learning in Bioinformatics · Bioinformatics and Genomic Networks
