# T2W-CogLoadNet: a framework for cognitive load assessment of dance movements based on deep learning-powered human pose estimation

**Authors:** Fei Zhao

PMC · DOI: 10.3389/fpsyg.2025.1707539 · Frontiers in Psychology · 2026-01-21

## TL;DR

This paper introduces T2W-CogLoadNet, a deep learning model for 3D dance posture estimation and indirect cognitive load assessment. It combines Temporal Convolutional Network (TCN)-Transformer temporal feature extraction with Whale Optimization Algorithm (WOA) hyperparameter optimization.

## Contribution

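The paper's contribution is the integration of a Temporal Convolutional Network (TCN)-Transformer temporal feature extractor with Whale Optimization Algorithm (WOA) optimization for 3D dance posture estimation and cognitive load modeling.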
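
To make the WOA component concrete, here is a minimal sketch of the standard Whale Optimization Algorithm update rules (encircling the best solution, random search, and the spiral move). It is not the paper's implementation. The names `woa_minimize`, `toy_loss`, and `BOUNDS` are hypothetical, the objective is a toy quadratic standing in for a validation loss, and the sketch tunes only two continuous hyperparameters rather than feature subsets.

```python
import math
import random

def woa_minimize(objective, bounds, n_whales=20, n_iters=100, seed=0):
    """Minimize `objective` over box `bounds` with a basic WOA loop."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) bound.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_score = objective(best)

    for t in range(n_iters):
        a = 2.0 * (1 - t / n_iters)  # decreases linearly from 2 toward 0
        for i, x in enumerate(whales):
            A = 2 * a * rng.random() - a  # scalar per whale (a simplification)
            C = 2 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: move toward the best whale; otherwise explore
                # around a randomly chosen whale.
                ref = best if abs(A) < 1 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral move around the best whale, with shape constant b = 1.
                l = rng.uniform(-1, 1)
                spiral = math.exp(l) * math.cos(2 * math.pi * l)
                new = [abs(b - v) * spiral + b for b, v in zip(best, x)]
            whales[i] = clip(new)
            score = objective(whales[i])
            if score < best_score:  # greedy update of the best-so-far solution
                best, best_score = whales[i][:], score
    return best, best_score

def toy_loss(x):
    # Hypothetical stand-in for a validation loss over (log10 learning rate, dropout).
    return (x[0] + 3.0) ** 2 + (x[1] - 0.2) ** 2

BOUNDS = [(-5.0, -1.0), (0.0, 0.5)]
best, score = woa_minimize(toy_loss, BOUNDS)
```

In a real pipeline the objective would train and validate the model for each candidate, which is far more expensive than this toy function, so the population size and iteration count would need to be budgeted accordingly.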

## Key findings

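- T2W-CogLoadNet outperforms HRNet and OpenPose on both cognitive load estimation error and 3D joint error. On AIST++ it reports MAE 0.23, RMSE 0.26, and MPJPE 0.45; on Kinetics 400 it reports 0.25, 0.28, and 0.48.
- Under noise injection and temporal scaling, the model's MAE remains lower than that of the baselines.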
- Future work includes multimodal input integration and lightweight real-time monitoring tools.
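
The robustness finding above refers to noise injection and temporal scaling. The summary does not say how these perturbations were implemented, so the sketch below is an assumption: Gaussian noise added to each joint coordinate, and resampling of the frame sequence by linear interpolation. The function names and the nested-list layout (`seq[frame][joint]` is an `(x, y, z)` tuple) are hypothetical.

```python
import random

def inject_noise(seq, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to every coordinate."""
    rng = random.Random(seed)
    return [[tuple(c + rng.gauss(0.0, sigma) for c in joint) for joint in frame]
            for frame in seq]

def time_scale(seq, factor):
    """Resample a sequence of at least 2 frames to about len(seq) * factor frames."""
    n_in = len(seq)
    n_out = max(2, round(n_in * factor))
    out = []
    for k in range(n_out):
        pos = k * (n_in - 1) / (n_out - 1)  # fractional index into the input
        i = min(int(pos), n_in - 2)         # left neighbour frame
        w = pos - i                         # interpolation weight for the right neighbour
        out.append([tuple((1 - w) * a + w * b for a, b in zip(j0, j1))
                    for j0, j1 in zip(seq[i], seq[i + 1])])
    return out

# Toy sequence: 5 frames, 1 joint moving along the x axis.
seq = [[(float(f), 0.0, 0.0)] for f in range(5)]
noisy = inject_noise(seq, sigma=0.01)
slow = time_scale(seq, 2.0)
```

A robustness check would then run the trained model on `noisy` and `slow` and compare the resulting errors with those on the unperturbed input.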

## Abstract

Dance posture estimation and cognitive load assessment are crucial for optimizing dance training outcomes and supporting rehabilitation applications. Traditional methods often rely on subjective judgment for cognitive load assessment and model the temporal features of dance insufficiently. This study proposes the T2W-CogLoadNet model, which integrates Temporal Convolutional Network (TCN)-Transformer temporal feature extraction with Whale Optimization Algorithm (WOA) hyperparameter optimization to achieve 3D dance posture estimation and cognitive load modeling (indirect measurement). In this model, the TCN captures local dynamic details of dance movements, the Transformer handles long-range temporal dependencies, and WOA simultaneously optimizes feature subsets and model parameters to improve performance. Experimental validation on the AIST++ professional dance dataset and the Kinetics 400 generalized motion dataset demonstrates that the model significantly outperforms baseline models such as High Resolution Network (HRNet) and OpenPose. On the AIST++ dataset, its mean absolute error (MAE) for cognitive load estimation reaches 0.23, its root mean square error (RMSE) reaches 0.26, and its mean per-joint position error (MPJPE) for 3D joints reaches 0.45. On the Kinetics 400 dataset, MAE, RMSE, and MPJPE reach 0.25, 0.28, and 0.48, respectively. Even under interference scenarios such as noise injection and temporal scaling, the model maintains robust performance, with its MAE consistently lower than that of the aforementioned baseline models. Future research will focus on integrating multimodal inputs to improve assessment reliability, enhancing the model's adaptability to different dance styles, and developing lightweight real-time monitoring tools to promote the widespread application of this technology in dance education and rehabilitation.
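
The three metrics reported in the abstract have standard definitions, sketched below in plain Python for reference. The function names, the toy inputs, and the data layout (`joints[frame][joint]` is an `(x, y, z)` tuple) are illustrative assumptions; the summary does not state the units or the scale of the cognitive load scores.

```python
import math

def mae(pred, target):
    """Mean absolute error over paired scalar predictions."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def rmse(pred, target):
    """Root mean square error over paired scalar predictions."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

def mpjpe(pred_joints, true_joints):
    """Mean per-joint position error: the Euclidean distance between predicted
    and ground-truth 3D joints, averaged over all frames and joints."""
    total, count = 0.0, 0
    for pred_frame, true_frame in zip(pred_joints, true_joints):
        for p, t in zip(pred_frame, true_frame):
            total += math.dist(p, t)
            count += 1
    return total / count

# Toy data, not taken from the paper.
load_pred = [0.5, 0.7, 0.2]
load_true = [0.4, 0.9, 0.2]
pose_pred = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
pose_true = [[(0.0, 0.0, 0.0), (4.0, 4.0, 0.0)]]

load_mae = mae(load_pred, load_true)
load_rmse = rmse(load_pred, load_true)
pose_mpjpe = mpjpe(pose_pred, pose_true)
```

RMSE penalizes large individual errors more heavily than MAE, so RMSE is never smaller than MAE on the same data; the reported pairs (0.23 and 0.26, 0.25 and 0.28) are consistent with that.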

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12867920/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12867920/full.md

## References

52 references — full list in the complete paper: https://tomesphere.com/paper/PMC12867920/full.md

---
Source: https://tomesphere.com/paper/PMC12867920