MagicAvatar: Multimodal Avatar Generation and Animation

Jianfeng Zhang; Hanshu Yan; Zhongcong Xu; Jiashi Feng; Jun Hao Liew

arXiv:2308.14748 · cs.GR · August 29, 2023 · 5 cites

Open Access

TL;DR

MagicAvatar introduces a two-stage framework for multimodal avatar video generation and animation, disentangling motion control from video synthesis to enhance flexibility and enable avatar animation from images or multimodal inputs.

Contribution

The paper proposes a two-stage approach to multimodal avatar generation that explicitly separates motion control from video synthesis. It also supports animating a specific person from a few images of them.

Findings

01. Effective multimodal-to-motion translation demonstrated.

02. High-quality avatar video generation achieved.

03. Flexible avatar animation from images shown.

Abstract

This report presents MagicAvatar, a framework for multimodal video generation and animation of human avatars. Unlike most existing methods, which generate avatar-centric videos directly from multimodal inputs (e.g., text prompts), MagicAvatar explicitly disentangles avatar video generation into two stages: (1) multimodal-to-motion and (2) motion-to-video generation. The first stage translates the multimodal inputs into motion/control signals (e.g., human pose, depth, DensePose), while the second stage generates avatar-centric video guided by these motion signals. Additionally, MagicAvatar supports avatar animation from just a few images of the target person: the provided human identity is animated according to the motion derived in the first stage. We demonstrate the flexibility of MagicAvatar through various applications, including…
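
The two-stage split described in the abstract can be pictured as two functions joined by an intermediate motion representation. The following is a minimal sketch under stated assumptions, not the authors' implementation: every name here (`MotionSignal`, `VideoFrame`, `multimodal_to_motion`, `motion_to_video`, `generate_avatar_video`) is hypothetical, and both stages are stubs that return placeholder data where a real system would run learned models.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class MotionSignal:
    """One frame of control signal (hypothetical stand-in for pose, depth, or DensePose)."""
    frame_index: int
    kind: str          # e.g. "pose", "depth", "densepose"
    data: List[float]  # placeholder payload; a real signal would be an image or keypoint array


@dataclass
class VideoFrame:
    """One output frame (hypothetical); records what produced it instead of pixels."""
    frame_index: int
    identity: str      # "provided" if reference images were given, else "generated"
    guided_by: str     # kind of motion signal that guided this frame


def multimodal_to_motion(prompt: str, num_frames: int, kind: str = "pose") -> List[MotionSignal]:
    """Stage 1 (stub): translate a multimodal input into per-frame motion signals."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Assumption: one signal per output frame. The payload here is a dummy value.
    return [MotionSignal(i, kind, [float(i)]) for i in range(num_frames)]


def motion_to_video(
    motion: Sequence[MotionSignal],
    identity_images: Optional[Sequence[str]] = None,
) -> List[VideoFrame]:
    """Stage 2 (stub): produce one frame per motion signal, optionally for a given identity."""
    identity = "provided" if identity_images else "generated"
    return [VideoFrame(m.frame_index, identity, m.kind) for m in motion]


def generate_avatar_video(
    prompt: str,
    num_frames: int,
    identity_images: Optional[Sequence[str]] = None,
) -> List[VideoFrame]:
    """Chain the two stages; the motion signals are the only link between them."""
    motion = multimodal_to_motion(prompt, num_frames)
    return motion_to_video(motion, identity_images)
```

Because the second stage depends only on the motion signals, the same stage-one output could in principle drive either a freshly generated avatar or a specific person supplied through `identity_images`, which is the flexibility the abstract attributes to the disentangled design.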

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · Multimodal Machine Learning Applications