Statistical quantification of confounding bias in predictive modelling

Tamas Spisak

arXiv:2111.00814 · cs.LG · May 30, 2025 · 1 citation

TL;DR

This paper introduces non-parametric statistical tests that detect confounding bias in predictive models, and demonstrates them on models trained on neuroimaging data.

Contribution

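It proposes the partial and full confounder tests, which probe the null hypotheses of an unconfounded and a fully confounded model, respectively. Both offer strict control of Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions, and are implemented in the mlconfound package.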

Findings

01

Identified previously unreported confounders in brain connectivity data.

02

Applied the tests to models trained on functional brain connectivity data from the Human Connectome Project and the Autism Brain Imaging Data Exchange dataset.

03

Found confounders that are hard to correct for with state-of-the-art confound mitigation approaches.

Abstract

The lack of non-parametric statistical tests for confounding bias significantly hampers the development of robust, valid and generalizable predictive models in many fields of research. Here I propose the partial and full confounder tests, which, for a given confounder variable, probe the null hypotheses of unconfounded and fully confounded models, respectively. The tests provide strict control of Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions, which are often seen in machine learning. Applying the proposed tests to models trained on functional brain connectivity data from the Human Connectome Project and the Autism Brain Imaging Data Exchange dataset reveals confounders that were previously unreported or found to be hard to correct for with state-of-the-art confound mitigation approaches. The tests, implemented in the package mlconfound…
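To make the two null hypotheses in the abstract concrete, here is a minimal Python sketch on simulated data. It is not the paper's method and does not use mlconfound: it swaps in a simple linear stand-in (residualize on the conditioning variable, then permute residuals), which is only reasonable when all dependencies are linear, whereas the proposed tests target non-normal and non-linear dependence. The function names, the simulated variables `y` (target), `c` (confounder) and `yhat` (model prediction), and all coefficients are hypothetical choices for illustration.

```python
import numpy as np


def residualize(a, given):
    """Residuals of `a` after least-squares regression on `given` (with intercept)."""
    design = np.column_stack([np.ones(len(given)), given])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    return a - design @ coef


def linear_conditional_perm_pvalue(a, b, given, num_perms=1000, seed=0):
    """Approximate permutation p-value for 'a is independent of b, given `given`'.

    Assumption: a linear stand-in, NOT the paper's procedure. Both variables are
    residualized on `given`, and the residuals of `b` are shuffled to build a
    null distribution of the absolute residual correlation.
    """
    rng = np.random.default_rng(seed)
    res_a = residualize(a, given)
    res_b = residualize(b, given)
    observed = abs(np.corrcoef(res_a, res_b)[0, 1])
    null = np.array([
        abs(np.corrcoef(res_a, rng.permutation(res_b))[0, 1])
        for _ in range(num_perms)
    ])
    # The +1 terms count the observed statistic as one of the permutations.
    return (1 + np.sum(null >= observed)) / (1 + num_perms)


# Simulated example: predictions depend on both the target and the confounder.
rng = np.random.default_rng(42)
n = 500
y = rng.normal(size=n)                                # target
c = 0.5 * y + rng.normal(size=n)                      # confounder, correlated with y
yhat = 0.5 * y + 0.5 * c + 0.5 * rng.normal(size=n)   # confounded predictions

# Partial test analogue: null = unconfounded (yhat independent of c given y).
p_partial = linear_conditional_perm_pvalue(yhat, c, given=y)
# Full test analogue: null = fully confounded (yhat independent of y given c).
p_full = linear_conditional_perm_pvalue(yhat, y, given=c)

print(f"partial (H0: unconfounded):   p = {p_partial:.4f}")
print(f"full (H0: fully confounded):  p = {p_full:.4f}")
```

In this simulation, a small `p_partial` argues against the unconfounded null, and a small `p_full` argues against the fully confounded null, so the predictions would be read as biased by `c` but not driven by it alone. For real analyses, the package named in the abstract is the appropriate tool rather than this linear approximation.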

Taxonomy

Topics: Functional Brain Connectivity Studies · Health, Environment, Cognitive Aging · Autism Spectrum Disorder Research