Spatiotemporal multimodal emotion recognition using temporal video sequences and pose features for child emotion classification
S K B Sangeetha, Raja Sarath Kumar Boddu, Amiya Bhaumik, Sandeep Kumar Mathivanan, Usha Moorthy

TL;DR
This paper introduces ST-MERN, a method for recognizing children's emotions from video and pose data. It reports 94.3% test accuracy with a BiLSTM architecture and better performance than existing models.
Contribution
A novel spatiotemporal multimodal emotion recognition network (ST-MERN) for child emotion classification using pose and temporal features.
Findings
The proposed model achieved 93.6% validation accuracy and 94.3% test accuracy using a BiLSTM architecture.
The TCN model offered real-time performance with 91.7% test accuracy and 0.8-second inference times.
The system effectively captures dynamic emotional nuances in children with stable pose data and low feature variability.
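To make the TCN finding above more concrete, here is a minimal sketch of a causal dilated 1-D convolution, the generic building block of temporal convolutional networks. It is not the paper's architecture: the kernel size, dilation, channel counts, and the function name `causal_dilated_conv1d` are assumptions chosen for illustration, and only the 115-frame sequence length comes from the abstract.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation=1):
    """Causal dilated convolution over a (T, C_in) sequence.

    x: array of shape (T, C_in), one feature row per frame.
    w: array of shape (K, C_in, C_out), one weight matrix per kernel tap.
    Returns an array of shape (T, C_out) where output[t] depends only on
    x[t], x[t - dilation], ..., x[t - (K - 1) * dilation].
    """
    T, c_in = x.shape
    K, w_in, c_out = w.shape
    if c_in != w_in:
        raise ValueError("input channels of x and w must match")
    pad = (K - 1) * dilation
    # Left-pad with zeros so that no output frame can see future frames.
    xp = np.concatenate([np.zeros((pad, c_in)), x], axis=0)
    y = np.zeros((T, c_out))
    for k in range(K):
        # Tap k reads the frame (K - 1 - k) * dilation steps in the past.
        start = k * dilation
        y += xp[start:start + T] @ w[k]
    return y

# Hypothetical shapes: 115 frames, 5 input features, 8 output channels.
rng = np.random.default_rng(0)
frames = rng.standard_normal((115, 5))
weights = rng.standard_normal((3, 5, 8))
out = causal_dilated_conv1d(frames, weights, dilation=2)
```

Because each output frame uses only current and past frames, a layer like this can run on a stream as frames arrive, which is one common reason TCN-style models are chosen for low-latency inference.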
Abstract
Developmental psychology and affective computing have recently placed great emphasis on identifying children’s emotional cues. This study proposes a novel Spatio-Temporal Multimodal Emotion Recognition Network (ST-MERN) for child emotion classification. It uses dense feature embeddings of the EmoReact dataset together with temporal video sequences. The proposed method uses 115 continuous frames per visual signal instance, covering signals such as rotational-translational vectors, facial keypoints, and pose predictions. Detection is stable across frames, with a mean confidence of 0.967, which indicates that the system maintains good detection fidelity. To track subtle emotional changes, the method captures dynamic data such as scale variation and frame-to-frame variation (rx, ry, rz, tx, ty). Latent features (p24–p33) provide a deeper description of emotional states. The model is…
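The frame-to-frame variation mentioned in the abstract can be illustrated with a short sketch. This is one plausible reading (first differences between consecutive frames), not the authors' confirmed computation. The column order (rx, ry, rz, tx, ty) follows the abstract, while the function names and the summary statistics are assumptions.

```python
import numpy as np

def frame_to_frame_variation(seq):
    """Return first differences between consecutive frames.

    seq: array-like of shape (T, D), one row of pose values per frame,
         e.g. D = 5 for (rx, ry, rz, tx, ty). Assumed layout.
    Returns an array of shape (T - 1, D).
    """
    seq = np.asarray(seq, dtype=float)
    return np.diff(seq, axis=0)

def summarize_variation(seq):
    """Per-feature summary of how much the pose changes between frames."""
    deltas = frame_to_frame_variation(seq)
    return {
        "mean_abs_delta": np.abs(deltas).mean(axis=0),
        "std_delta": deltas.std(axis=0),
    }

# Hypothetical 115-frame sequence with 5 pose values per frame.
rng = np.random.default_rng(0)
pose_seq = rng.standard_normal((115, 5))
summary = summarize_variation(pose_seq)
```

Small values in `mean_abs_delta` would correspond to the stable pose data and low feature variability described in the findings, under the assumptions stated above.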
Taxonomy
Topics: Emotion and Mood Recognition · Infant Health and Development · Child Development and Digital Technology
