AvatarFusion: Zero-shot Generation of Clothing-Decoupled 3D Avatars Using 2D Diffusion
Shuo Huang, Zongxin Yang, Liangting Li, Yi Yang, Jia Jia

TL;DR
AvatarFusion introduces a novel zero-shot framework for generating realistic 3D avatars with decoupled clothing, leveraging a latent diffusion model and a new segmentation strategy to improve realism and customization.
Contribution
AvatarFusion is the first framework to use a latent diffusion model for pixel-level guidance in 3D avatar generation while simultaneously segmenting clothing from the avatar's body, enabling realistic and customizable avatars.
Findings
Outperforms previous methods on all evaluation metrics.
Enables clothing exchange on avatars because the clothing model is decoupled from the body.
Establishes the first benchmark for zero-shot text-to-avatar generation.
Abstract
Large-scale pre-trained vision-language models allow for the zero-shot text-based generation of 3D avatars. The previous state-of-the-art method utilized CLIP to supervise neural implicit models that reconstructed a human body mesh. However, this approach has two limitations. Firstly, the lack of avatar-specific models can cause facial distortion and unrealistic clothing in the generated avatars. Secondly, CLIP only provides optimization direction for the overall appearance, resulting in less impressive results. To address these limitations, we propose AvatarFusion, the first framework to use a latent diffusion model to provide pixel-level guidance for generating human-realistic avatars while simultaneously segmenting clothing from the avatar's body. AvatarFusion includes the first clothing-decoupled neural implicit avatar model that employs a novel Dual Volume Rendering strategy to…
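For readers unfamiliar with the term, the snippet below is a minimal sketch of standard single-volume rendering along one camera ray, the usual way a neural implicit model is turned into pixels. It is not the paper's Dual Volume Rendering strategy, whose details are not given in the truncated abstract above; the function name `composite_along_ray` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def composite_along_ray(densities, colors, deltas):
    """Standard volume-rendering quadrature for a single ray.

    densities: (N,) non-negative volume densities at the ray samples
    colors:    (N, 3) RGB colour predicted at each sample
    deltas:    (N,) distance between consecutive samples
    Illustrative only; not the AvatarFusion implementation.
    """
    densities = np.asarray(densities, dtype=float)
    colors = np.asarray(colors, dtype=float)
    deltas = np.asarray(deltas, dtype=float)

    # Opacity contributed by each sample segment.
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    transmittance = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    # Per-sample contribution to the final pixel.
    weights = transmittance * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights

# Usage: an empty sample followed by an effectively opaque green one.
rgb, weights = composite_along_ray(
    densities=[0.0, 1e9],
    colors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    deltas=[1.0, 1.0],
)
```

Because each pixel is a weighted sum over samples, any image-space supervision signal (from CLIP or from a diffusion model) can be backpropagated to the densities and colours that produced it.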
Taxonomy
Methods: Latent Diffusion Model · Diffusion · Contrastive Language-Image Pre-training
