DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars
Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

TL;DR
DiffusionAvatars introduces a novel diffusion-based neural rendering method for creating high-fidelity, controllable 3D head avatars that maintain temporal consistency across poses and expressions.
Contribution
The paper presents a deferred diffusion approach leveraging a neural parametric head model and cross-attention conditioning to improve 3D head avatar synthesis.
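As a rough illustration of cross-attention conditioning in general, the NumPy sketch below lets image feature tokens (queries) attend to a small set of conditioning tokens (keys and values). It is not the paper's implementation: the function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the idea of presenting the expression code as a short token sequence are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(image_tokens, cond_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper).

    image_tokens: (N, d_img)  features of the image being denoised
    cond_tokens:  (M, d_cond) conditioning tokens, e.g. derived from an
                  expression code (the tokenization is an assumption here)
    w_q: (d_img, d), w_k: (d_cond, d), w_v: (d_cond, d)
    Returns the attended features (N, d) and the attention weights (N, M).
    """
    q = image_tokens @ w_q                    # queries from the image
    k = cond_tokens @ w_k                     # keys from the condition
    v = cond_tokens @ w_v                     # values from the condition
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product scores
    weights = softmax(scores, axis=-1)        # each row sums to 1 over M
    return weights @ v, weights

# Usage with random placeholder data: 16 image tokens, 4 condition tokens.
rng = np.random.default_rng(0)
img = rng.normal(size=(16, 32))
cond = rng.normal(size=(4, 8))
out, attn = cross_attention(
    img, cond,
    rng.normal(size=(32, 24)),
    rng.normal(size=(8, 24)),
    rng.normal(size=(8, 24)),
)
```

Each image token receives a weighted mixture of the conditioning values, which is how a global signal such as an expression code can influence every spatial location of the rendered image.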
Findings
Produces high-quality, temporally consistent 3D head avatars
Outperforms existing methods in visual quality and consistency
Effective in self-reenactment and animation scenarios
Abstract
DiffusionAvatars synthesizes a high-fidelity 3D head avatar of a person, offering intuitive control over both pose and expression. We propose a diffusion-based neural renderer that leverages generic 2D priors to produce compelling images of faces. For coarse guidance of the expression and head pose, we render a neural parametric head model (NPHM) from the target viewpoint, which acts as a proxy geometry of the person. Additionally, to enhance the modeling of intricate facial expressions, we condition DiffusionAvatars directly on the expression codes obtained from NPHM via cross-attention. Finally, to synthesize consistent surface details across different viewpoints and expressions, we rig learnable spatial features to the head's surface via TriPlane lookup in NPHM's canonical space. We train DiffusionAvatars on RGB videos and corresponding fitted NPHM meshes of a person and test the…
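A triplane lookup of the kind the abstract mentions can be sketched generically as follows: a 3D point in canonical space is projected onto three axis-aligned feature planes, each plane is sampled bilinearly, and the three samples are combined. This is a minimal NumPy sketch under stated assumptions, not the authors' code: the names `bilinear_sample` and `triplane_lookup`, the `[-1, 1]` coordinate range, and aggregation by summation are choices made for this example (concatenation is another common option).

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Bilinearly sample a feature plane (hypothetical helper).

    plane: (H, W, C) learnable feature grid
    u, v:  (N,) coordinates in [-1, 1]; u indexes columns, v indexes rows
    Returns (N, C) interpolated features.
    """
    H, W, _ = plane.shape
    # Map [-1, 1] to continuous pixel coordinates.
    x = (np.clip(u, -1.0, 1.0) + 1.0) * 0.5 * (W - 1)
    y = (np.clip(v, -1.0, 1.0) + 1.0) * 0.5 * (H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = plane[y0, x0] * (1 - wx) + plane[y0, x1] * wx
    bottom = plane[y1, x0] * (1 - wx) + plane[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def triplane_lookup(planes, points):
    """Look up features for 3D points from three axis-aligned planes.

    planes: dict with keys 'xy', 'xz', 'yz', each of shape (H, W, C)
    points: (N, 3) canonical-space coordinates, assumed to lie in [-1, 1]^3
    Returns (N, C) features, aggregated by summation (an assumption).
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return (bilinear_sample(planes["xy"], x, y)
            + bilinear_sample(planes["xz"], x, z)
            + bilinear_sample(planes["yz"], y, z))

# Usage: three 3x3 planes whose single channel equals the column index.
ramp = np.tile(np.arange(3.0)[None, :, None], (3, 1, 1))
planes = {"xy": ramp, "xz": ramp, "yz": ramp}
features = triplane_lookup(planes, np.array([[0.0, 0.0, 0.0],
                                             [0.5, 0.0, 0.0]]))
```

Because the lookup happens in a canonical space, a surface point keeps the same feature regardless of the current expression or camera, which is the property the abstract relies on for consistent surface detail.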
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis
