# Design and analysis of teaching early warning system based on multimodal data in an intelligent learning environment

**Authors:** Xinxin Kang, Yong Nie

PMC · DOI: 10.7717/peerj-cs.2692 · PeerJ Computer Science · 2025-03-04

## TL;DR

This paper introduces an early warning system for online teaching that uses multimodal data to detect and analyze teachers' emotions in instructional videos.

## Contribution

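The paper proposes an emotional transition point search algorithm for segmenting long instructional videos, a neutral-segment filtering algorithm based on facial features, a semi-supervised iterative feature normalization algorithm, and an attention-based deep learning model for multimodal emotion recognition in teaching.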

## Key findings

- An efficient algorithm for identifying emotional transition points in long instructional videos was developed.
- A neutral emotional segment filtering method based on facial features improved emotional recognition accuracy.
- The attention-based multimodal model fuses speech and facial image features at the feature level to classify emotions in teacher videos.
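
This summary does not describe how the transition point search works, so the following is only a minimal sketch of one plausible approach: sample the video at a coarse stride and bisect whenever two consecutive samples disagree. The `classify` callback, the `stride` parameter, and the assumption that each coarse interval contains at most one transition are all hypothetical and not taken from the paper.

```python
from typing import Callable, List


def find_transitions(classify: Callable[[int], str],
                     n_frames: int,
                     stride: int) -> List[int]:
    """Return frame indices at which the emotion label changes.

    Assumption (not from the paper): at most one transition occurs
    inside any window of `stride` frames, so bisection is valid.
    `classify` is a hypothetical per-frame emotion classifier.
    """
    transitions: List[int] = []
    if n_frames <= 0:
        return transitions

    lo = 0
    lo_label = classify(lo)
    while lo < n_frames - 1:
        hi = min(lo + stride, n_frames - 1)
        if classify(hi) != lo_label:
            # Invariant: frame a has lo_label, frame b does not.
            a, b = lo, hi
            while b - a > 1:
                mid = (a + b) // 2
                if classify(mid) == lo_label:
                    a = mid
                else:
                    b = mid
            transitions.append(b)
            lo, lo_label = b, classify(b)
        else:
            lo = hi
    return transitions


# Toy example: a 100-frame "video" with two label changes.
labels = ["neutral"] * 50 + ["happy"] * 30 + ["neutral"] * 20
result = find_transitions(lambda i: labels[i], len(labels), stride=16)
print(result)
```

Under these assumptions the classifier is called far fewer times than once per frame, which is the kind of saving that matters for long videos. A shorter stride trades extra classifier calls for a lower risk of missing brief emotional segments.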

## Abstract

In online teaching environments, the lack of direct emotional interaction between teachers and students makes it difficult for teachers to consciously and effectively manage their emotional expressions. A teaching early warning system offers a novel approach to the intelligent evaluation and improvement of online education. This study focuses on two tasks: segmenting instructional videos into emotional segments and recognizing the emotion in each segment. For segmentation, an efficient algorithm is proposed that searches for emotional transition points in long videos. Because teachers tend to maintain a neutral emotional state for significant portions of their teaching, a neutral emotional segment filtering algorithm based on facial features is also designed. For recognition, a multimodal emotion recognition model for instructional videos is proposed. The raw speech and facial image features are first preprocessed with a semi-supervised iterative feature normalization algorithm, which eliminates differences between individual teachers while preserving the inherent differences between emotions. A deep learning-based model then applies an attention mechanism to automatically assign weights for feature-level modal fusion, providing users with accurate emotion classification. Finally, a teaching early warning system is implemented based on these algorithms.
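
The idea of attention-weighted feature-level fusion can be illustrated with a small sketch. It assumes that the speech and facial features have already been projected to a common dimension and that a query vector (learned in a real model, fixed here for illustration) scores each modality; the function names and the dot-product scoring are assumptions, not the paper's architecture.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(modal_features: Sequence[Sequence[float]],
                   query: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Fuse per-modality feature vectors with attention weights.

    Each modality is scored by its dot product with `query`
    (a hypothetical stand-in for a learned parameter). The scores
    are normalized with softmax, and the fused vector is the
    weighted sum of the modality vectors.
    """
    scores = [sum(f * q for f, q in zip(feat, query))
              for feat in modal_features]
    weights = softmax(scores)
    dim = len(query)
    fused = [sum(w * feat[i] for w, feat in zip(weights, modal_features))
             for i in range(dim)]
    return weights, fused


# Toy features: one vector per modality, same dimension.
speech = [1.0, 0.0]
face = [0.0, 1.0]

# A zero query scores both modalities equally.
w_equal, fused_equal = attention_fuse([speech, face], [0.0, 0.0])

# A query aligned with the speech vector shifts weight toward speech.
w_speech, fused_speech = attention_fuse([speech, face], [2.0, 0.0])
print(w_equal, fused_equal)
print(w_speech, fused_speech)
```

In a trained model the weights would vary per sample, so a segment with an occluded face could lean more on speech and a silent segment more on facial features.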

## Full-text entities

- **Diseases:** COVID-19 (MESH:D000086382), anxiety (MESH:D001007)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11888907/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11888907/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/PMC11888907/full.md

---
Source: https://tomesphere.com/paper/PMC11888907