Dyadic Interaction Modeling for Social Behavior Generation
Minh Tran, Di Chang, Maksim Siniukov, Mohammad Soleymani

TL;DR
This paper introduces a novel dyadic interaction modeling framework that jointly learns speaker and listener behaviors to generate realistic 3D facial motions in social interactions, advancing the realism and diversity of generated nonverbal behaviors.
Contribution
The paper presents Dyadic Interaction Modeling (DIM), a pre-training approach using masking and contrastive learning to capture dyadic context for nonverbal behavior generation.
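As a rough illustration of the contrastive half of such a pre-training objective, the sketch below computes an InfoNCE-style loss that pulls each speaker embedding toward its paired listener embedding and away from the other listeners in the batch. This is an assumption-laden sketch, not the paper's implementation: the function name `info_nce`, the temperature value, and the toy 2-D embeddings are all hypothetical, and the masking component is not shown.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Assumes a non-zero vector; scales it to unit length.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(speaker_emb, listener_emb, temperature=0.1):
    """Mean cross-entropy of matching speaker i to listener i within a batch.

    Hypothetical helper for illustration; the temperature is an assumed value.
    """
    s = [normalize(v) for v in speaker_emb]
    l = [normalize(v) for v in listener_emb]
    loss = 0.0
    for i, sv in enumerate(s):
        # Cosine similarity of speaker i with every listener in the batch.
        logits = [dot(sv, lv) / temperature for lv in l]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of the true (paired) listener.
        loss += log_denom - logits[i]
    return loss / len(s)

# Toy batch: correctly paired embeddings versus deliberately swapped pairs.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The loss is close to zero when paired embeddings agree and grows when the pairing is broken, which is the signal that encourages representations to encode the shared dyadic context.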
Findings
Outperforms existing methods in generating diverse, realistic listener motions
Establishes new state-of-the-art in motion diversity and realism metrics
Qualitative results show improved expression, eye blinks, and head gestures
Abstract
Human-human communication is like a delicate dance where listeners and speakers concurrently interact to maintain conversational dynamics. Hence, an effective model for generating listener nonverbal behaviors requires understanding the dyadic context and interaction. In this paper, we present an effective framework for creating 3D facial motions in dyadic interactions. Existing work considers a listener as a reactive agent with reflexive behaviors to the speaker's voice and facial motions. The heart of our framework is Dyadic Interaction Modeling (DIM), a pre-training approach that jointly models speakers' and listeners' motions through masking and contrastive learning to learn representations that capture the dyadic context. To enable the generation of non-deterministic behaviors, we encode both listener and speaker motions into discrete latent representations through VQ-VAE. The…
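The discrete latent representation mentioned above rests on the vector-quantization step of a VQ-VAE: each continuous motion feature is replaced by the index of its nearest codebook entry. A minimal sketch of that lookup follows, assuming a tiny hand-written codebook and 2-D features; the names `quantize`, `codebook`, and `frames` are hypothetical, and a real model would learn the codebook jointly with an encoder and decoder.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def quantize(frames, codebook):
    """Map each feature vector to its nearest codebook entry.

    Returns the discrete token indices and the corresponding quantized vectors.
    Hypothetical helper for illustration only.
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in frames
    ]
    return indices, [codebook[k] for k in indices]

# Assumed toy codebook with three entries and three per-frame motion features.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7]]

indices, quantized = quantize(frames, codebook)
print(indices)  # [0, 1, 2]
```

Because the output is a sequence of discrete tokens, a generator can predict a distribution over codebook indices and sample from it, which is one way to obtain non-deterministic behaviors.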
Taxonomy
Topics: Innovative Human-Technology Interaction · Digital Mental Health Interventions · Data Visualization and Analytics
Methods: Contrastive Learning · VQ-VAE
