Learning a Delighting Prior for Facial Appearance Capture in the Wild
Yuxuan Han, Xin Ming, Tianxiao Li, Zhuofan Shen, Qixuan Zhang, Lan Xu, Feng Xu

TL;DR
This paper introduces a delighting prior, trained on OLAT data and rendered Light Stage scans, that constrains in-the-wild facial appearance capture and enables high-quality reflectance estimation from casual videos.
Contribution
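The paper proposes a dataset-conditioned delighting network that uses Dataset Latent Modulation (DLM) to integrate heterogeneous training sources. The resulting prior is reported to outperform existing proprietary models and enables automatic, high-quality facial appearance capture from casual videos.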
Findings
Outperforms prior methods in reflectance estimation accuracy.
Enables automatic facial appearance capture from casual videos.
Transforms datasets into high-resolution relightable scans.
Abstract
High-quality facial appearance capture has traditionally required costly studio recording. Recent works consider an in-the-wild smartphone-based setup; however, their model-based inverse rendering paradigm struggles with the complex disentanglement of reflectance from unknown illumination. To bridge this gap, we propose to shift the paradigm into training a powerful delighting network as a prior to constrain the optimization. We leverage the OLAT dataset and the rendered Light Stage scans for training, and propose Dataset Latent Modulation (DLM) to seamlessly integrate these heterogeneous data sources. Specifically, by conditioning the core network on learnable source-aware tokens, we decouple dataset-specific styles from physical delighting principles, enabling the emergence of a delighting prior that outperforms existing proprietary models. This powerful delighting prior enables a…
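The abstract describes DLM only at a high level: the core network is conditioned on learnable source-aware tokens so that dataset-specific style is kept apart from the shared delighting behavior. As a rough illustration of that idea, and not the authors' implementation, the sketch below uses a FiLM-style per-channel scale and shift, one pair per data source. The modulation form, the class name `DatasetLatentModulation`, the source indices, and the feature values are all assumptions made for this example.

```python
import random


class DatasetLatentModulation:
    """Hypothetical sketch: one learnable token per training data source."""

    def __init__(self, num_sources, channels, seed=0):
        rng = random.Random(seed)
        # Each token is stored as a per-channel (scale, shift) pair.
        # Assumption: the paper may inject its tokens differently.
        self.scale = [[1.0 + rng.gauss(0.0, 0.1) for _ in range(channels)]
                      for _ in range(num_sources)]
        self.shift = [[rng.gauss(0.0, 0.1) for _ in range(channels)]
                      for _ in range(num_sources)]

    def __call__(self, features, source_id):
        # features: per-channel activations from the shared core network.
        scale, shift = self.scale[source_id], self.shift[source_id]
        return [f * s + b for f, s, b in zip(features, scale, shift)]


OLAT, LIGHT_STAGE = 0, 1           # hypothetical source indices
dlm = DatasetLatentModulation(num_sources=2, channels=4)
shared = [0.5, -1.0, 2.0, 0.0]     # stand-in for shared core features

out_olat = dlm(shared, OLAT)          # modulated with the OLAT token
out_stage = dlm(shared, LIGHT_STAGE)  # same features, Light Stage token
```

In this sketch the shared features are identical for both sources and only the selected token changes the output, which is one simple way to keep source-specific style in a small set of parameters while the rest of the network is shared.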
