Speech-driven Animation with Meaningful Behaviors

Najmeh Sadoughi; Carlos Busso

arXiv:1708.01640·cs.HC·May 15, 2023

TL;DR

This paper introduces a novel speech-driven animation method that combines rule-based and data-driven approaches using a dynamic Bayesian network to generate meaningful, synchronized gestures reflecting speech content.

Contribution

It proposes a constrained DBN model that incorporates discourse functions and prototypical behaviors to produce more natural and meaningful agent movements.
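To make the idea of a constraining discrete variable concrete, here is a minimal toy sketch. It is not the authors' model. It assumes a single discrete hidden state per frame whose transition probabilities are selected by a per-frame constraint label, and it omits the speech features that a speech-driven DBN would also observe. The constraint labels, gesture names, and all probabilities are hypothetical, chosen only for illustration.

```python
import random

# Hypothetical discourse-function labels; the labels used in the paper may differ.
CONSTRAINTS = ("none", "question", "affirmation")

# Hypothetical gesture prototypes, one per hidden state.
GESTURES = ("rest", "head_tilt", "head_nod")

# P(next_state | current_state, constraint): one row-stochastic matrix per
# constraint value. All numbers are made up for illustration.
TRANSITIONS = {
    "none":        [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1], [0.6, 0.1, 0.3]],
    "question":    [[0.2, 0.7, 0.1], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1]],
    "affirmation": [[0.2, 0.1, 0.7], [0.2, 0.1, 0.7], [0.1, 0.1, 0.8]],
}


def sample_states(constraints, seed=0, start_state=0):
    """Sample one hidden state per frame, conditioned on that frame's constraint."""
    rng = random.Random(seed)
    state = start_state
    states = []
    for c in constraints:
        # The constraint picks which transition matrix applies at this frame.
        row = TRANSITIONS[c][state]
        state = rng.choices(range(len(row)), weights=row, k=1)[0]
        states.append(state)
    return states


def generate_gestures(constraints, seed=0):
    """Map the sampled hidden states to their (hypothetical) gesture labels."""
    return [GESTURES[s] for s in sample_states(constraints, seed)]


if __name__ == "__main__":
    frames = ["none"] * 3 + ["question"] * 4 + ["affirmation"] * 4
    print(generate_gestures(frames, seed=1))
```

In this sketch, frames labeled "question" make the "head_tilt" state more likely and frames labeled "affirmation" favor "head_nod", while unconstrained frames mostly stay at "rest". The point is only that a discrete label can bias a temporal model toward a behavior without fixing the exact motion.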

Findings

1. The constrained model outperforms unconstrained models in evaluations.
2. The approach effectively synchronizes gestures with speech.
3. It captures meaningful behaviors aligned with discourse and prototypical cues.

Abstract

Conversational agents (CAs) play an important role in human-computer interaction. Creating believable movements for CAs is challenging, since the movements have to be meaningful and natural, reflecting the coupling between gestures and speech. Past studies have mainly relied on rule-based or data-driven approaches. Rule-based methods focus on creating meaningful behaviors that convey the underlying message, but the gestures cannot be easily synchronized with speech. Data-driven approaches, especially speech-driven models, can capture the relationship between speech and gestures. However, they create behaviors that disregard the meaning of the message. This study proposes to bridge the gap between these two approaches, overcoming their limitations. The approach builds a dynamic Bayesian network (DBN), where a discrete variable is added to constrain the behaviors on the underlying…
