Multimodal Conditional 3D Face Geometry Generation

Christopher Otto; Prashanth Chandran; Sebastian Weiss; Markus Gross; Gaspard Zoss; Derek Bradley

arXiv:2407.01074 · cs.CV · September 3, 2025

TL;DR

This paper introduces a versatile diffusion-based method for multimodal conditional 3D face generation, enabling user control over identity and expression from various input signals within a single model.

Contribution

It presents a novel diffusion approach with cross-attention for integrating multiple conditioning signals in 3D face generation, offering high-quality, topology-consistent results.

Findings

01. Generates 3D faces from artistic sketches, portrait photos, Canny edges, FLAME face model parameters, 2D face landmarks, or text prompts.

02. Provides fine-grained user control over identity and expression.

03. Produces high-quality, topology-consistent 3D face geometry.

Abstract

We present a new method for multimodal conditional 3D face geometry generation that allows user-friendly control over the output identity and expression via a number of different conditioning signals. Within a single model, we demonstrate 3D faces generated from artistic sketches, portrait photos, Canny edges, FLAME face model parameters, 2D face landmarks, or text prompts. Our approach is based on a diffusion process that generates 3D geometry in a 2D parameterized UV domain. Geometry generation passes each conditioning signal through a set of cross-attention layers (IP-Adapter), one set for each user-defined conditioning signal. The result is an easy-to-use 3D face generation tool that produces topology-consistent, high-quality geometry with fine-grain user control.
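
The abstract describes one set of cross-attention layers per conditioning signal, in the style of IP-Adapter. The sketch below illustrates that general idea in NumPy under several assumptions: the class name `MultiSignalAttention`, the signal names, the feature sizes, the shared query projection, and the additive combination with per-signal scales are all illustrative choices, not details confirmed by the paper. The weights are random, so it shows only the data flow, not a trained model.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(hidden, cond, w_q, w_k, w_v):
    """Single-head cross-attention: UV tokens attend to condition tokens."""
    q = hidden @ w_q                      # (n_tokens, d_model)
    k = cond @ w_k                        # (n_cond, d_model)
    v = cond @ w_v                        # (n_cond, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v   # (n_tokens, d_model)


class MultiSignalAttention:
    """Hypothetical layer: one key/value adapter per conditioning signal.

    Assumption: queries are shared and each adapter's output is added to
    the hidden state, as in IP-Adapter's decoupled cross-attention.
    """

    def __init__(self, d_model, cond_dims, seed=0):
        rng = np.random.default_rng(seed)
        self.w_q = rng.normal(0.0, 0.02, (d_model, d_model))
        # Separate (w_k, w_v) pair for every named signal.
        self.adapters = {
            name: (
                rng.normal(0.0, 0.02, (dim, d_model)),
                rng.normal(0.0, 0.02, (dim, d_model)),
            )
            for name, dim in cond_dims.items()
        }

    def __call__(self, hidden, conds, scales=None):
        scales = scales or {}
        out = hidden.copy()
        # Only the signals the user actually supplies contribute.
        for name, cond in conds.items():
            w_k, w_v = self.adapters[name]
            attn = cross_attention(hidden, cond, self.w_q, w_k, w_v)
            out = out + scales.get(name, 1.0) * attn
        return out


# Illustrative sizes: a 16x16 UV feature map flattened to 256 tokens.
rng = np.random.default_rng(1)
layer = MultiSignalAttention(d_model=64, cond_dims={"sketch": 32, "landmarks": 16})
hidden = rng.normal(size=(256, 64))
conds = {
    "sketch": rng.normal(size=(8, 32)),      # e.g. 8 image-encoder tokens
    "landmarks": rng.normal(size=(68, 16)),  # e.g. 68 embedded 2D landmarks
}
out = layer(hidden, conds)
out_sketch_only = layer(hidden, {"sketch": conds["sketch"]})
out_muted = layer(hidden, conds, scales={"sketch": 0.0, "landmarks": 0.0})
```

Keeping the adapters in a dictionary means any subset of signals can be passed at inference time, and a per-signal scale gives a simple knob for how strongly each input steers the result.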

Taxonomy

Topics: Face recognition and analysis · Human Motion and Animation · 3D Shape Modeling and Analysis

Methods: Sparse Evolutionary Training · Diffusion