One Shot, One Talk: Whole-body Talking Avatar from a Single Image

Jun Xiang; Yudong Guo; Leipeng Hu; Boyang Guo; Yancheng Yuan; Juyong Zhang

arXiv:2412.01106·cs.CV·December 3, 2024

TL;DR

This paper introduces a pipeline for building a realistic, animatable whole-body talking avatar from a single image. It addresses two challenges, complex dynamic modeling and generalization to novel gestures and expressions, by combining a 3DGS-mesh hybrid avatar representation with pose-guided image-to-video diffusion models.

Contribution

The method uses pose-guided image-to-video diffusion models to generate imperfect video frames as pseudo-labels, then fits a tightly coupled 3DGS-mesh hybrid avatar representation to them, with regularizations that mitigate the inconsistencies in those labels.

Findings

01 Enables creation of photorealistic avatars from one image

02 Achieves precise animation of gestures and expressions

03 Demonstrates robustness across diverse subjects

Abstract

Building realistic and animatable avatars still requires minutes of multi-view or monocular self-rotating videos, and most methods lack precise control over gestures and expressions. To push this boundary, we address the challenge of constructing a whole-body talking avatar from a single image. We propose a novel pipeline that tackles two critical issues: 1) complex dynamic modeling and 2) generalization to novel gestures and expressions. To achieve seamless generalization, we leverage recent pose-guided image-to-video diffusion models to generate imperfect video frames as pseudo-labels. To overcome the dynamic modeling challenge posed by inconsistent and noisy pseudo-videos, we introduce a tightly coupled 3DGS-mesh hybrid avatar representation and apply several key regularizations to mitigate inconsistencies caused by imperfect labels. Extensive experiments on diverse subjects…

Taxonomy

Topics: Virtual Reality Applications and Impacts · Augmented Reality Applications

Methods: Diffusion