To React or not to React: End-to-End Visual Pose Forecasting for Personalized Avatar during Dyadic Conversations

Chaitanya Ahuja; Shugao Ma; Louis-Philippe Morency; Yaser Sheikh

arXiv:1910.02181·cs.CV·October 8, 2019

TL;DR

This paper presents DRAM, a neural model that predicts natural avatar body poses during conversations by modeling both individual and interactive behaviors using adaptive attention, improving realism and interaction quality.

Contribution

Introduces DRAM, a novel neural architecture that combines intrapersonal and interpersonal dynamics with adaptive attention for end-to-end pose forecasting in avatars during dyadic conversations.

Findings

01. DRAM outperforms non-adaptive models in naturalness of generated poses.

02. Adaptive attention effectively captures interpersonal dynamics.

03. User study confirms improved realism of avatar behaviors.

Abstract

Nonverbal behaviours such as gestures, facial expressions, body posture, and para-linguistic cues have been shown to complement or clarify verbal messages. Hence, to improve telepresence in the form of an avatar, it is important to model these behaviours, especially in dyadic interactions. Creating such personalized avatars requires modeling not only the intrapersonal dynamics between an avatar's speech and its body pose, but also the interpersonal dynamics with the interlocutor present in the conversation. In this paper, we introduce a neural architecture named Dyadic Residual-Attention Model (DRAM), which integrates intrapersonal (monadic) and interpersonal (dyadic) dynamics using selective attention to generate sequences of body pose conditioned on audio and body pose of the interlocutor and audio of the human operating the avatar. We evaluate our proposed model on dyadic…
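The integration of monadic and dyadic dynamics described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function names `sigmoid` and `blend_pose`, the per-joint sigmoid gate, and the residual blend formula are all assumptions chosen for illustration. The sketch assumes a monadic model and a dyadic model have each already produced a pose estimate for one time step, and that an attention mechanism has produced one logit per pose dimension.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def blend_pose(monadic_pose, dyadic_pose, gate_logits):
    """Blend two pose estimates with a per-dimension gate.

    Hypothetical illustration only (not the DRAM equations):
      monadic_pose: pose predicted from the avatar operator's own audio
      dyadic_pose:  pose predicted with the interlocutor's audio and pose
      gate_logits:  unnormalised attention scores, one per pose dimension

    A gate near 0 keeps the monadic estimate; a gate near 1 moves the
    output toward the dyadic estimate.
    """
    if not (len(monadic_pose) == len(dyadic_pose) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")

    blended = []
    for m, d, g in zip(monadic_pose, dyadic_pose, gate_logits):
        w = sigmoid(g)                    # attention weight in [0, 1]
        blended.append(m + w * (d - m))   # monadic estimate plus gated residual
    return blended


if __name__ == "__main__":
    # With zero logits the gate is 0.5, so the output is the midpoint.
    print(blend_pose([0.0, 1.0], [1.0, 3.0], [0.0, 0.0]))  # [0.5, 2.0]
```

Writing the blend as the monadic estimate plus a gated residual makes the "react or not" decision explicit: when the gate closes, the avatar's pose depends only on its operator's speech, and when it opens, the interlocutor's behaviour influences the output.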
