Multimodal Learning For Classroom Activity Detection

Hang Li; Yu Kang; Wenbiao Ding; Song Yang; Songfan Yang; Gale Yan Huang; Zitao Liu

arXiv:1910.13799 · eess.AS · February 12, 2020 · 1 citation

Open Access

TL;DR

This paper introduces an attention-based neural framework for multimodal classroom activity detection. The framework identifies whether the teacher or a student is speaking and records the length of each utterance, without relying heavily on additional recording devices, and is designed to generalize across different teachers and students.

Contribution

The paper presents a device-free, attention-based neural framework that fuses speech and language modalities for classroom activity detection, outperforming existing methods.

Findings

01 Outperforms state-of-the-art baselines in classroom activity detection

02 Effective fusion of speech and language modalities using attention mechanisms

03 Device-free approach applicable to various classroom recordings

Abstract

Classroom activity detection (CAD) focuses on accurately classifying whether the teacher or a student is speaking and recording the length of each utterance during a class. A CAD solution gives teachers instant feedback on their pedagogical instruction. This greatly improves educators' teaching skills and hence students' achievement. However, CAD is very challenging because (1) the CAD model must generalize well across different teachers and students; (2) data from the vocal and language modalities must be fused carefully so that they complement each other; and (3) the solution should not rely heavily on additional recording devices. In this paper, we address these challenges with a novel attention-based neural framework. Our framework not only extracts both speech and language information, but also uses an attention mechanism to capture long-term…
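To make the idea of fusing two modalities with attention concrete, here is a minimal NumPy sketch. It is not the authors' architecture, which the truncated abstract does not specify. It assumes each utterance already has a speech feature vector and a text feature vector, fuses them by concatenation, applies plain scaled dot-product self-attention across the utterances of one class session, and labels each utterance with an untrained linear layer. All function names, dimensions, and the random inputs are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def fuse_and_attend(speech, text):
    """Concatenate per-utterance features and apply self-attention.

    speech: (n_utterances, d_speech) array, assumed precomputed.
    text:   (n_utterances, d_text) array, assumed precomputed.
    Returns the context vectors and the attention weights.
    """
    fused = np.concatenate([speech, text], axis=1)
    # Scaled dot-product scores between every pair of utterances.
    scores = fused @ fused.T / np.sqrt(fused.shape[1])
    weights = softmax(scores, axis=1)
    # Each utterance becomes a weighted mix of all utterances in the session.
    context = weights @ fused
    return context, weights


def classify_roles(context, w, b=0.0):
    """Label each utterance with a logistic layer (hypothetical, untrained)."""
    probs = 1.0 / (1.0 + np.exp(-(context @ w + b)))
    return np.where(probs >= 0.5, "teacher", "student")


# Random stand-ins for real features (assumption: 6 utterances).
rng = np.random.default_rng(0)
n_utt, d_speech, d_text = 6, 8, 4
speech = rng.normal(size=(n_utt, d_speech))
text = rng.normal(size=(n_utt, d_text))

context, weights = fuse_and_attend(speech, text)
w = rng.normal(size=d_speech + d_text)  # untrained weights, illustration only
roles = classify_roles(context, w)
```

Because the weights are random, the predicted roles carry no meaning here. The sketch only shows how attention lets each utterance's representation draw on other utterances in the same session, which is one way to capture longer-range context.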

Taxonomy

Topics: Speech and dialogue systems · Online Learning and Analytics · Subtitles and Audiovisual Media