Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images
Cuican Yu, Guansong Lu, Yihan Zeng, Jian Sun, Xiaodan Liang, Huibin Li, Zongben Xu, Songcen Xu, Wei Zhang, Hang Xu

TL;DR
This paper introduces TG-3DFace, a novel method for generating realistic 3D faces from text descriptions using only 2D face data, with techniques to ensure semantic consistency and high-quality outputs.
Contribution
The paper proposes a text-guided 3D face generation framework that learns from 2D face data and introduces cross-modal alignment and classifier guidance for improved realism and diversity.
Findings
Boosts multi-view consistency by 9% over Latent3D
Achieves better FID and CLIP scores than text-to-2D face/image generation models
Generates more realistic and semantically consistent 3D faces
Abstract
Generating 3D faces from textual descriptions has a multitude of applications, such as gaming, movies, and robotics. Recent progress has demonstrated the success of unconditional 3D face generation and text-to-3D shape generation. However, because text-3D face data pairs are scarce, text-driven 3D face generation remains an open problem. In this paper, we propose a text-guided 3D face generation method, referred to as TG-3DFace, for generating realistic 3D faces using text guidance. Specifically, we adopt an unconditional 3D face generation framework and equip it with text conditions, so that it learns text-guided 3D face generation from text-2D face data only. On top of that, we propose two text-to-face cross-modal alignment techniques, global contrastive learning and a fine-grained alignment module, to facilitate high semantic consistency between generated 3D faces and…
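The global contrastive learning mentioned in the abstract can be illustrated with a minimal sketch, assuming a CLIP-style symmetric InfoNCE loss over a batch of paired text and rendered-face embeddings. The function name, the temperature value, and the embedding inputs are illustrative assumptions and are not taken from the paper, whose exact loss may differ.

```python
import numpy as np


def global_contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss over paired text/image embeddings (hypothetical sketch).

    text_emb, image_emb: arrays of shape (batch, dim); row i of each is a matched pair.
    """
    # L2-normalise so the dot product is a cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)

    # (batch, batch) similarity matrix; diagonal entries are the matched pairs.
    logits = t @ v.T / temperature

    def cross_entropy_diag(z):
        # Numerically stable row-wise log-softmax, then average the diagonal.
        z = z - z.max(axis=1, keepdims=True)
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the text-to-image and image-to-text directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


# Toy usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
text = rng.normal(size=(8, 16))
aligned_loss = global_contrastive_loss(text, text)
shuffled_loss = global_contrastive_loss(text, np.roll(text, 1, axis=0))
```

In this toy setting the loss is small when each text embedding matches its own image embedding and large when the pairing is shifted by one row, which is the behaviour a global alignment objective is meant to reward.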
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis
Methods: Contrastive Learning · Contrastive Language-Image Pre-training
