Detecting User Engagement in Everyday Conversations
Chen Yu, Paul M. Aoki, Allison Woodruff

TL;DR
This paper estimates conversational engagement from speech by combining per-utterance emotion recognition with coupled hidden Markov models, which capture the temporal and interactive dynamics that utterance-level classification alone misses.
Contribution
It proposes a multilevel model in which SVM-based emotion classifiers feed a high-level HMM that estimates engagement in continuous natural speech, going beyond per-utterance emotion detection.
Findings
Effective estimation of engagement levels from speech data.
Improved modeling of conversational dynamics.
Successful application to real speech corpora.
Abstract
This paper presents a novel application of speech emotion recognition: estimation of the level of conversational engagement between users of a voice communication system. We begin by using machine learning techniques, such as the support vector machine (SVM), to classify users' emotions as expressed in individual utterances. However, this alone fails to model the temporal and interactive aspects of conversational engagement. We therefore propose the use of a multilevel structure based on coupled hidden Markov models (HMMs) to estimate engagement levels in continuous natural speech. The first level consists of SVM-based classifiers that recognize emotional states, which could be, for example, discrete emotion types or arousal/valence levels. A high-level HMM then uses these emotional states as input, estimating users' engagement in conversation by decoding the internal states of the HMM. We…
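The second level described in the abstract, decoding engagement states from a sequence of recognized emotional states, can be illustrated with a minimal sketch. It makes several simplifying assumptions: it uses a single-chain HMM rather than the coupled HMMs the paper proposes, it treats the first-level classifier outputs as already-available discrete labels, and every label set and probability below is hypothetical and chosen only for illustration, not taken from the paper.

```python
import math

# Hypothetical label sets; the abstract does not fix these.
EMOTIONS = ["neutral", "happy", "angry"]   # outputs of a first-level classifier
ENGAGEMENT = ["low", "high"]               # hidden states to decode

# Illustrative, made-up probabilities (not from the paper).
START = {"low": 0.6, "high": 0.4}
TRANS = {
    "low": {"low": 0.8, "high": 0.2},
    "high": {"low": 0.3, "high": 0.7},
}
EMIT = {
    "low": {"neutral": 0.7, "happy": 0.2, "angry": 0.1},
    "high": {"neutral": 0.2, "happy": 0.5, "angry": 0.3},
}


def viterbi(observations):
    """Return the most likely engagement sequence for a list of emotion labels."""
    if not observations:
        return []
    # scores[s] = best log-probability of any state path ending in state s
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in ENGAGEMENT
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in ENGAGEMENT:
            # Pick the predecessor state that maximizes the path score into s.
            best_prev = max(
                ENGAGEMENT, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(ENGAGEMENT, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    utterance_emotions = ["neutral", "neutral", "happy", "happy", "angry"]
    print(viterbi(utterance_emotions))
```

Working in log-probabilities avoids numerical underflow on long conversations. A coupled model would additionally condition each speaker's transition on the other speaker's previous state, which this single-chain sketch does not attempt.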
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · AI in Service Interactions
