Dual-Model Prediction of Affective Engagement and Vocal Attractiveness from Speaker Expressiveness in Video Learning

Hung-Yue Suen, Kuo-En Hung, and Fan-Hsun Tseng

arXiv:2603.18758 · cs.HC · March 20, 2026

Open Access

TL;DR

This study presents a speaker-centric Emotion AI that predicts audience engagement and vocal attractiveness from speaker expressions in video learning, demonstrating high accuracy without using audience data.

Contribution

It introduces a dual regression model approach leveraging speaker-side multimodal features to predict audience responses, advancing privacy-preserving affective computing.

Findings

01. High predictive performance for engagement (R² = 0.85)

02. High predictive performance for vocal attractiveness (R² = 0.88)

03. Speaker-side features can effectively represent audience feedback

Abstract

This paper outlines a machine learning-enabled, speaker-centric Emotion AI approach that predicts audience affective engagement and vocal attractiveness in asynchronous video-based learning, relying solely on speaker-side affective expressions. Motivated by the demand for scalable, privacy-preserving affective computing applications, the approach incorporates two distinct regression models that leverage a massive corpus developed within Massive Open Online Courses (MOOCs) to enable affectively engaging experiences. The first regression model predicts affective engagement by integrating emotional expressions derived from facial dynamics, oculomotor features, prosody, and cognitive semantics; the second predicts vocal attractiveness based exclusively on speaker-side acoustic features. Notably, on…
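The dual-model setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the feature dimensions, the synthetic data, the ridge regressor, and the helper names (`fit_ridge`, `predict`, `r2_score`) are hypothetical stand-ins, since this listing does not specify the paper's regressors, features, or corpus. The sketch only shows the overall shape: one regressor on concatenated multimodal speaker features, a second on acoustic features alone, each scored with R².

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                        Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b

def predict(X, w, b):
    return X @ w + b

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n = 400  # number of synthetic "video segments" (assumption)

# Hypothetical speaker-side feature blocks; the dimensions are arbitrary.
facial = rng.normal(size=(n, 4))
gaze = rng.normal(size=(n, 3))      # stand-in for oculomotor features
prosody = rng.normal(size=(n, 5))
semantic = rng.normal(size=(n, 6))
acoustic = rng.normal(size=(n, 8))

# Model 1 input: all multimodal blocks; model 2 input: acoustic only.
X_eng = np.hstack([facial, gaze, prosody, semantic])
X_voc = acoustic

# Synthetic targets (NOT real ratings): linear signal plus noise.
y_eng = X_eng @ rng.normal(size=X_eng.shape[1]) + 0.5 * rng.normal(size=n)
y_voc = X_voc @ rng.normal(size=X_voc.shape[1]) + 0.5 * rng.normal(size=n)

train, test = slice(0, 300), slice(300, n)

# Fit the two regressors independently and score each on held-out rows.
w_eng, b_eng = fit_ridge(X_eng[train], y_eng[train])
w_voc, b_voc = fit_ridge(X_voc[train], y_voc[train])

r2_eng = r2_score(y_eng[test], predict(X_eng[test], w_eng, b_eng))
r2_voc = r2_score(y_voc[test], predict(X_voc[test], w_voc, b_voc))

print(f"engagement R^2 (synthetic): {r2_eng:.3f}")
print(f"vocal attractiveness R^2 (synthetic): {r2_voc:.3f}")
```

The scores printed here describe the synthetic data only and say nothing about the paper's reported results. Keeping the two models separate means the vocal-attractiveness predictor never sees visual or semantic inputs, which matches the abstract's statement that it relies exclusively on acoustic features.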

Taxonomy

Topics: Emotion and Mood Recognition · Social Robot Interaction and HRI · Intelligent Tutoring Systems and Adaptive Learning