Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions

Shaoxu Li

arXiv:2306.02903·cs.CV·June 6, 2023·2 cites

TL;DR

This paper introduces a novel method for creating and editing photo-realistic 3D avatars from short videos and text instructions, leveraging diffusion models and neural radiance fields to produce animatable avatars.

Contribution

It presents a new approach combining diffusion models and neural radiance fields for text-guided avatar editing and synthesis from monocular videos.

Findings

01. Outperforms state-of-the-art methods in quantitative and qualitative studies on various subjects.

02. Produces photo-realistic, animatable 3D neural head avatars.

03. Edits avatars according to text instructions.

Abstract

We propose a method for synthesizing edited photo-realistic digital avatars from text instructions. Given a short monocular RGB video and text instructions, our method uses an image-conditioned diffusion model to edit one head image and a video stylization method to propagate the edit to the other head images. Through iterative training and updating (three or more rounds), our method synthesizes edited, photo-realistic, animatable 3D neural head avatars with a deformable neural radiance field head synthesis method. In quantitative and qualitative studies on various subjects, our method outperforms state-of-the-art methods.
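The pipeline described in the abstract can be sketched as a short loop. This is only an illustrative outline under stated assumptions, not the author's implementation. The function name `iterative_avatar_edit` and the four callables (`edit_image`, `propagate_edit`, `train_avatar`, `render_frames`) are hypothetical placeholders for the diffusion editor, the video stylization step, the deformable NeRF trainer, and its renderer. The abstract does not say what is updated between rounds, so feeding the avatar's renders back in as the next round's frames is an assumption made here.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")    # one head image from the video
Avatar = TypeVar("Avatar")  # a trained, animatable head model


def iterative_avatar_edit(
    frames: Sequence[Frame],
    instruction: str,
    edit_image: Callable[[Frame, str], Frame],
    propagate_edit: Callable[[Frame, Sequence[Frame]], List[Frame]],
    train_avatar: Callable[[Sequence[Frame]], Avatar],
    render_frames: Callable[[Avatar], List[Frame]],
    rounds: int = 3,
    exemplar_index: int = 0,
) -> Avatar:
    """Hypothetical outline: edit one frame, propagate, train, repeat."""
    if not frames:
        raise ValueError("frames must not be empty")
    if rounds < 1:
        raise ValueError("rounds must be at least 1")

    # Edit a single exemplar frame with the text instruction.
    edited_exemplar = edit_image(frames[exemplar_index], instruction)

    current = list(frames)
    for _ in range(rounds):
        # Carry the exemplar edit over to every frame (video stylization).
        targets = propagate_edit(edited_exemplar, current)
        # Fit the head avatar to the edited frames.
        avatar = train_avatar(targets)
        # ASSUMPTION: the avatar's renders become the next round's inputs.
        current = render_frames(avatar)
    return avatar
```

Passing the four stages in as callables keeps the sketch independent of any particular diffusion, stylization, or NeRF library. The default `rounds=3` mirrors the "three times or more" stated in the abstract.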

Code & Models

Repositories

lsx0101/instruct-video2avatar (Official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Human Motion and Animation

Methods: Diffusion