# Generalised machine learning models outperform personalised models for cognitive load classification in real-life settings

**Authors:** Christoph Anders, Ipsita Bhaduri, Bert Arnrich

PMC · DOI: 10.3389/fdgth.2025.1650085 · Frontiers in Digital Health · 2025-10-06

## TL;DR

Generalised machine learning models such as Logistic Regression classified cognitive load from wearable-sensor data with F1 scores of up to 0.91 for two classes, outperforming personalised models in real-life settings.

## Contribution

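The study introduces a balanced data collection design, with eight working hours per participant split between controlled and uncontrolled environments, and reports F1 scores of up to 0.91 for two-class cognitive load classification with generalised models.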

## Key findings

- Generalised models such as Logistic Regression achieved F1 scores of up to 0.91, 0.77, and 0.54 for two-, three-, and five-class cognitive load classification, respectively.
- Across participants, no single feature correlated significantly with cognitive load, but smartwatch indices and biomarkers differed between low- and high-load scenarios.
- The anonymised dataset was made publicly available.
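
The distinction between generalised and personalised models can be illustrated with a minimal sketch. It assumes a leave-one-participant-out split for the generalised case and a chronological within-participant split for the personalised case; this summary does not state the paper's actual evaluation protocol, and the function names, the `train_fraction` parameter, and the toy participant IDs are all hypothetical.

```python
from collections import defaultdict


def _group_indices(participant_ids):
    """Map each participant ID to the list of its sample indices."""
    groups = defaultdict(list)
    for idx, pid in enumerate(participant_ids):
        groups[pid].append(idx)
    return groups


def generalised_splits(participant_ids):
    """Yield (held_out, train_idx, test_idx): train on all other participants,
    test on the held-out one (assumed leave-one-participant-out scheme)."""
    groups = _group_indices(participant_ids)
    for held_out in sorted(groups):
        train_idx = sorted(
            i for pid, idxs in groups.items() if pid != held_out for i in idxs
        )
        yield held_out, train_idx, groups[held_out]


def personalised_splits(participant_ids, train_fraction=0.75):
    """Yield (pid, train_idx, test_idx) using only that participant's samples:
    earlier samples for training, later ones for testing (an assumption)."""
    groups = _group_indices(participant_ids)
    for pid in sorted(groups):
        idxs = groups[pid]
        cut = max(1, int(len(idxs) * train_fraction))
        yield pid, idxs[:cut], idxs[cut:]


# Hypothetical toy data: three participants with four samples each.
ids = ["p1"] * 4 + ["p2"] * 4 + ["p3"] * 4
for held_out, train_idx, test_idx in generalised_splits(ids):
    print(held_out, len(train_idx), len(test_idx))
```

In the generalised case the model never sees data from the test participant, which is the scenario a cognitive load assistant faces with a new user; in the personalised case the model sees only that user's own earlier data.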

## Abstract

By issuing work-break reminders, for example, personal assistants for cognitive load could be beneficial in maintaining health and life satisfaction in society. Wearable sensors facilitate the necessary real-time collection of physiological data. Still, publicly available real-life data sets obtained with wearable sensors are scarce, especially considering multi-modal recordings. Furthermore, data is usually recorded in either completely controlled or uncontrolled environments, missing the opportunity to study participants across optimal laboratory and realistic real-life settings.

This work collected data from ten university students during given and self-chosen cognitive load tasks, resembling typical working environments from over 40% of the OECD population, and investigated if commercially available sensors suffice for building cognitive load assistants. The study design accounted for a balanced distribution of eight working hours per participant, split between controlled and uncontrolled environments.

Across participants, no single feature correlated significantly with cognitive load, but differences in smartwatch indices and biomarkers were identified between low- and high-load scenarios. Generalised machine learning models like Logistic Regression achieved F1 scores of up to 0.91, 0.77, and 0.54 for two-, three-, and five-class classification, respectively.
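
For readers unfamiliar with the metric, the following sketch shows how an F1 score is computed for a multi-class problem. It assumes macro averaging (the unweighted mean of per-class F1 scores); this summary does not state which averaging scheme the paper used, and the labels and predictions below are invented toy values, not data from the study.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score of one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are both 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Invented two-class toy example: low vs. high cognitive load.
y_true = ["low", "low", "high", "high"]
y_pred = ["low", "high", "high", "high"]
print(round(macro_f1(y_true, y_pred), 3))
```

Because each class contributes equally under macro averaging, the score tends to drop as the number of classes grows and the classes become harder to separate, which is consistent with the decline from two to five classes reported above.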

The presented study design marks a step towards real-life mental state assistants, and the anonymised dataset was made publicly available.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12536347/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12536347/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/PMC12536347/full.md

---
Source: https://tomesphere.com/paper/PMC12536347