Multimodal Emotion Coupling via Speech-to-Facial and Bodily Gestures in Dyadic Interaction
By Ralph Dane Marquez Herbuela, Yukie Nagai

TL;DR
This paper investigates how emotional speech and gestures are coordinated in dyadic interactions, analyzing multimodal data to improve understanding of emotional communication and enhance real-time emotion detection systems.
Contribution
It introduces a detailed analysis of multimodal emotion coupling using motion capture data, revealing how speech and gestures synchronize across different emotional states and interaction contexts.
Findings
Nonoverlapping speech increases facial and mouth activeness.
Sadness correlates with increased expressivity during nonoverlap.
Hand gesture synchrony is higher under low arousal and overlapping speech.
Abstract
Human emotional expression emerges through coordinated vocal, facial, and gestural signals. While speech-face alignment is well established, the broader dynamics linking emotionally expressive speech to regional facial and hand motion remain critical for gaining deeper insight into how emotional and behavioral cues are communicated in real interactions. The structure of conversational exchange further modulates this coordination: sequential turn taking creates stable temporal windows for multimodal synchrony, whereas simultaneous speech, often indicative of high-arousal moments, disrupts this alignment and impacts emotional clarity. Understanding these dynamics enhances real-time emotion detection by improving the accuracy of timing and synchrony across modalities in both human interactions and AI systems. This study examines multimodal emotion coupling using region specific…
Taxonomy
Topics: Action Observation and Synchronization · Emotion and Mood Recognition · Face Recognition and Perception
