Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions
Malte Prinzler, Egor Zakharov, Vanessa Sklyarova, Berna Kabadayi, Justus Thies

TL;DR
Joker generates 3D human heads with extreme facial expressions from a single image, using multi-modal conditioning (a 3DMM plus text) for expressiveness and a 3D distillation process for view consistency.
Contribution
It introduces a new approach combining 2D diffusion priors, 3DMM control, and a 3D distillation technique for view-consistent, expressive 3D head synthesis from minimal input.
Findings
Achieves state-of-the-art results in 3D head synthesis.
Successfully models extreme tongue articulation.
Generalizes well to out-of-domain samples like sculptures and paintings.
Abstract
We introduce Joker, a new method for the conditional synthesis of 3D human heads with extreme expressions. Given a single reference image of a person, we synthesize a volumetric human head with the reference identity and a new expression. We offer control over the expression via a 3D morphable model (3DMM) and textual inputs. This multi-modal conditioning signal is essential since 3DMMs alone fail to define subtle emotional changes and extreme expressions, including those involving the mouth cavity and tongue articulation. Our method is built upon a 2D diffusion-based prior that generalizes well to out-of-domain samples, such as sculptures, heavy makeup, and paintings while achieving high levels of expressiveness. To improve view consistency, we propose a new 3D distillation technique that converts predictions of our 2D prior into a neural radiance field (NeRF). Both the 2D prior and…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Emotion and Mood Recognition
