Text and Image Guided 3D Avatar Generation and Manipulation

Zehranaz Canfes; M. Furkan Atasoy; Alara Dirik; Pinar Yanardag

arXiv:2202.06079 · cs.CV · February 15, 2022 · 1 citation

TL;DR

This paper introduces a novel method for manipulating 3D face avatars' shape and texture using text or image prompts, leveraging CLIP and a pre-trained 3D GAN within a differentiable rendering pipeline, achieving efficient and targeted modifications.

Contribution

It presents a new 3D manipulation technique that controls shape and texture via prompts, combining CLIP with 3D GANs for efficient avatar editing.

Findings

01. Manipulation takes only 5 minutes per instance.
02. Effective control of shape and texture using prompts.
03. Demonstrated superior results through extensive comparisons.

Abstract

The manipulation of latent space has recently become an interesting topic in the field of generative models. Recent research shows that latent directions can be used to manipulate images towards certain attributes. However, controlling the generation process of 3D generative models remains a challenge. In this work, we propose a novel 3D manipulation method that can manipulate both the shape and texture of the model using text- or image-based prompts such as 'a young face' or 'a surprised face'. We leverage the power of the Contrastive Language-Image Pre-training (CLIP) model and a pre-trained 3D GAN model designed to generate face avatars, and create a fully differentiable rendering pipeline to manipulate meshes. More specifically, our method takes an input latent code and modifies it such that the target attribute specified by a text or image prompt is present or enhanced, while leaving…
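
The idea of nudging a latent code until its rendered output matches a prompt embedding can be illustrated with a toy sketch. This is not the authors' implementation and uses no real CLIP, GAN, or renderer: the matrix `A` is a hypothetical linear stand-in for the whole "generate, render, embed" chain, `target` stands in for the prompt embedding, and the L2 penalty `lam` that keeps the edit small is an assumption made here for illustration. All function names are likewise invented for this example.

```python
import math
import random


def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_t_vec(A, v):
    """Multiply the transpose of A by vector v."""
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))


def loss_and_grad(delta, w0, A, target, lam):
    """Loss = (1 - cosine similarity) + lam * ||delta||^2, with its gradient."""
    w = [a + d for a, d in zip(w0, delta)]
    e = matvec(A, w)  # stand-in for: generate -> render -> embed
    c = cosine(e, target)
    e_norm, t_norm = norm(e), norm(target)
    # Derivative of the cosine similarity with respect to the embedding e.
    dcos_de = [t / (e_norm * t_norm) - c * x / (e_norm ** 2)
               for x, t in zip(e, target)]
    dcos_dw = mat_t_vec(A, dcos_de)  # chain rule through the linear map
    loss = (1.0 - c) + lam * sum(d * d for d in delta)
    grad = [-g + 2.0 * lam * d for g, d in zip(dcos_dw, delta)]
    return loss, grad


def edit_latent(w0, A, target, lam=0.01, steps=100, lr=0.5):
    """Gradient descent on an offset `delta`, halving the step until the loss drops."""
    delta = [0.0] * len(w0)
    loss, grad = loss_and_grad(delta, w0, A, target, lam)
    for _ in range(steps):
        step = lr
        for _ in range(30):
            cand = [d - step * g for d, g in zip(delta, grad)]
            cand_loss, cand_grad = loss_and_grad(cand, w0, A, target, lam)
            if cand_loss < loss:
                delta, loss, grad = cand, cand_loss, cand_grad
                break
            step *= 0.5
        else:
            break  # no step size reduced the loss; stop early
    return [a + d for a, d in zip(w0, delta)]


# Toy data: a 4-dimensional latent mapped into an 8-dimensional embedding space.
rng = random.Random(0)
A = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(8)]
w0 = [rng.gauss(0, 1) for _ in range(4)]
target = [rng.gauss(0, 1) for _ in range(8)]

w_edit = edit_latent(w0, A, target)
before = cosine(matvec(A, w0), target)
after = cosine(matvec(A, w_edit), target)
print(f"similarity before: {before:.3f}, after: {after:.3f}")
```

In a real pipeline the linear map would be replaced by a differentiable generator, renderer, and image encoder, and the gradient would come from automatic differentiation rather than the hand-derived formula used here.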

Code & Models

Repositories

catlab-team/latent3D_code
PyTorch · Official

Videos

Text and Image Guided 3D Avatar Generation and Manipulation · YouTube

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation