# Pitfalls in using ML to predict cognitive function performance

**Authors:** Gianna Kuhles, Sami Hamdan, Stefan Heim, Simon B. Eickhoff, Kaustubh R. Patil, Julia A. Camilleri, Susanne Weis

PMC · DOI: 10.1038/s41598-025-24325-9 · Scientific Reports · 2025-10-29

## TL;DR

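Using a case study that predicts executive function (EF) performance from prosodic speech features in 231 healthy participants, this paper shows how confounding variables (age, sex, education) can inflate the apparent accuracy of machine learning predictions of cognitive function.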

## Contribution

The study demonstrates confound leakage in ML predictions of cognitive performance through a worked case example: predicting EF performance from 264 prosodic features while controlling for age, sex, and education.

## Key findings

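- Prediction of the Trail Making Test EF variables appeared reasonable at first, but in-depth analyses indicated that the accuracy was inflated by confound leakage.
- The leakage was attributed to a strong relationship between the confounds (age, sex, education) and the prediction targets.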
- Controlling for confounds is essential in ML pipelines for cognitive predictions.
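
As an illustration of what controlling for confounds inside an ML pipeline can look like, here is a minimal sketch. It assumes linear confound regression on the features, with coefficients estimated on the training fold only, and an ordinary least-squares predictor. The function names (`with_intercept`, `cv_predict`, `r2`) and the synthetic data are hypothetical; this is not the authors' pipeline. Note that, per the abstract, indications of leakage appeared even though confounds were controlled for, so a setup like this is a starting point rather than a guarantee.

```python
import numpy as np

def with_intercept(A):
    """Prepend a column of ones so least squares fits an intercept."""
    return np.column_stack([np.ones(len(A)), A])

def cv_predict(X, y, C, remove_confounds=True, n_splits=5, seed=0):
    """Out-of-fold predictions of y from X, optionally after regressing
    the confounds C out of X using training-fold coefficients only."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    y_pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        X_tr, X_te = X[train], X[test]
        if remove_confounds:
            # Confound model is fitted on the training fold, then applied to both folds.
            B, *_ = np.linalg.lstsq(with_intercept(C[train]), X_tr, rcond=None)
            X_tr = X_tr - with_intercept(C[train]) @ B
            X_te = X_te - with_intercept(C[test]) @ B
        # Ordinary least-squares prediction model (an assumption of this sketch).
        w, *_ = np.linalg.lstsq(with_intercept(X_tr), y[train], rcond=None)
        y_pred[test] = with_intercept(X_te) @ w
    return y_pred

def r2(y, y_pred):
    """Coefficient of determination pooled over all out-of-fold predictions."""
    return 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data: the target depends only on a confound,
# and the features carry that confound linearly plus noise.
rng = np.random.default_rng(0)
n = 231
C = rng.normal(size=(n, 3))                       # placeholders for age, sex, education
X = C @ rng.normal(size=(3, 10)) + rng.normal(size=(n, 10))
y = C[:, 0] + 0.1 * rng.normal(size=n)

r2_raw = r2(y, cv_predict(X, y, C, remove_confounds=False))
r2_deconf = r2(y, cv_predict(X, y, C, remove_confounds=True))
print(f"R^2 without confound removal: {r2_raw:.2f}")
print(f"R^2 with fold-wise confound removal: {r2_deconf:.2f}")
```

In this toy setting the features have no link to the target beyond the shared confound, so the apparent predictive performance should largely vanish once the confound is regressed out.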

## Abstract

Machine learning analyses are widely used for predicting cognitive abilities, yet there are pitfalls that need to be considered during their implementation and interpretation of the results. Hence, the present study aimed at drawing attention to the risks of erroneous conclusions incurred by confounding variables illustrated by a case example predicting executive function (EF) performance by prosodic features. Healthy participants (n = 231) performed speech tasks and EF tests. From 264 prosodic features, we predicted EF performance using 66 variables, controlling for confounding effects of age, sex, and education. A reasonable prediction performance was apparently achieved for EF variables of the Trail Making Test. However, in-depth analyses revealed indications of confound leakage, leading to inflated prediction accuracies, due to a strong relationship between confounds and targets. These findings highlight the need to control confounding variables in ML pipelines and caution against potential pitfalls in ML predictions.
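
One simple way to probe for the strong relationship between confounds and targets mentioned in the abstract is a confound-only baseline: predict the target from the confounds alone and compare the result with the feature-based model. The sketch below is a hedged illustration on synthetic data, not the in-depth analysis reported in the paper; `cv_r2` and all variable names are hypothetical.

```python
import numpy as np

def cv_r2(F, y, n_splits=5, seed=0):
    """Cross-validated R^2 (pooled over folds) of a least-squares model y ~ F."""
    F1 = np.column_stack([np.ones(len(F)), F])    # add intercept column
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    y_pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        w, *_ = np.linalg.lstsq(F1[train], y[train], rcond=None)
        y_pred[test] = F1[test] @ w
    return 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data: the target is driven by one confound,
# while the features are pure noise.
rng = np.random.default_rng(1)
n = 231
confounds = rng.normal(size=(n, 3))               # placeholders for age, sex, education
features = rng.normal(size=(n, 20))
target = 0.8 * confounds[:, 0] + 0.5 * rng.normal(size=n)

r2_confounds = cv_r2(confounds, target)
r2_features = cv_r2(features, target)
print(f"confound-only R^2: {r2_confounds:.2f}")
print(f"feature-only R^2:  {r2_features:.2f}")
```

If the confound-only baseline performs about as well as, or better than, the full model, any reported accuracy deserves a closer look before it is attributed to the features.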

The online version contains supplementary material available at 10.1038/s41598-025-24325-9.

## Full-text entities

- **Diseases:** neurological and psychiatric disorders (MESH:D001523), neurodegenerative diseases (MESH:D019636), impaired language function (MESH:D007806), Foreign Accent Syndrome (MESH:D000081042), neurological or mental impairment (MESH:D009422), frontotemporal dementia (MESH:D057180), aphasia (MESH:D001037), Autism Spectrum Disorder (MESH:D000067877), depressive disorders (MESH:D003866), impaired working memory (MESH:D008569), impaired prosodic skills (MESH:D019957), cognitive decline (MESH:D003072), frontal brain damage (MESH:D001927), impairment in prosody (MESH:D060825), verbal reasoning impairments (MESH:D001039), dysarthria (MESH:D004401), Parkinson's Disease (MESH:D010300)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12572133/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12572133/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12572133/full.md

---
Source: https://tomesphere.com/paper/PMC12572133