AgentAvatar: Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents

Duomin Wang; Bin Dai; Yu Deng; Baoyuan Wang

arXiv:2311.17465 · cs.CV · December 5, 2023 · 1 citation

TL;DR

This paper introduces AgentAvatar, a framework that combines large language models and neural rendering to generate realistic, interactive avatar agents capable of nuanced facial animations from high-level inputs.

Contribution

It presents a novel disentangled pipeline that separates planning, driving, and rendering, enabling flexible and realistic avatar animation from high-level descriptions.

Findings

01. Effective in generating photorealistic avatar animations
02. Versatile across monadic and dyadic interactions
03. Validated on multiple datasets

Abstract

In this study, our goal is to create interactive avatar agents that can autonomously plan and animate nuanced facial movements realistically, from both visual and behavioral perspectives. Given high-level inputs about the environment and agent profile, our framework harnesses LLMs to produce a series of detailed text descriptions of the avatar agents' facial motions. These descriptions are then processed by our task-agnostic driving engine into motion token sequences, which are subsequently converted into continuous motion embeddings that are further consumed by our standalone neural-based renderer to generate the final photorealistic avatar animations. These streamlined processes allow our framework to adapt to a variety of non-verbal avatar interactions, both monadic and dyadic. Our extensive study, which includes experiments on both newly compiled and existing datasets featuring two…
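The staged data flow described in the abstract (high-level inputs, then text descriptions, then motion tokens, then continuous motion embeddings, then rendered frames) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the names `AgentContext`, `run_pipeline`, and the `toy_*` functions are hypothetical, and the stubs stand in for the LLM planner, driving engine, and neural renderer without reproducing how the paper implements them.

```python
from dataclasses import dataclass
from typing import Callable, List

# All names here are hypothetical placeholders, not the paper's API.


@dataclass
class AgentContext:
    environment: str  # high-level description of the scene
    profile: str      # high-level description of the agent


# Each stage is an independent callable, so any one can be swapped out
# without touching the others.
Planner = Callable[[AgentContext], List[str]]        # context -> text descriptions
Driver = Callable[[List[str]], List[int]]            # descriptions -> motion tokens
Decoder = Callable[[List[int]], List[List[float]]]   # tokens -> motion embeddings
Renderer = Callable[[List[List[float]]], List[str]]  # embeddings -> frames


def run_pipeline(ctx: AgentContext, plan: Planner, drive: Driver,
                 decode: Decoder, render: Renderer) -> List[str]:
    """Chain the stages in the order the abstract lists them."""
    descriptions = plan(ctx)
    tokens = drive(descriptions)
    embeddings = decode(tokens)
    return render(embeddings)


# Toy stand-ins; a real system would call an LLM and trained networks.
def toy_plan(ctx: AgentContext) -> List[str]:
    return [f"{ctx.profile} smiles slightly", f"{ctx.profile} nods slowly"]


def toy_drive(descriptions: List[str]) -> List[int]:
    # Map each description to one token id in a vocabulary of size 8.
    return [len(d) % 8 for d in descriptions]


def toy_decode(tokens: List[int]) -> List[List[float]]:
    # Turn each discrete token into a small continuous vector.
    return [[t / 8.0, 1.0 - t / 8.0] for t in tokens]


def toy_render(embeddings: List[List[float]]) -> List[str]:
    # Emit one placeholder "frame" string per embedding.
    return [f"frame({e[0]:.3f}, {e[1]:.3f})" for e in embeddings]


ctx = AgentContext(environment="a quiet cafe", profile="a friendly listener")
frames = run_pipeline(ctx, toy_plan, toy_drive, toy_decode, toy_render)
print(frames)
```

Because the stages only communicate through their inputs and outputs, replacing `toy_render` with a different renderer leaves the planner and driver untouched, which is the kind of separation the abstract refers to as a standalone renderer and a task-agnostic driving engine.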

Taxonomy

Topics: Face recognition and analysis · Human Motion and Animation · 3D Shape Modeling and Analysis