Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture
ShahRukh Athar, Shunsuke Saito, Zhengyu Yang, Stanislav Pidhorsky, Chen Cao

TL;DR
This paper introduces a method that generates studio-like, uniformly lit facial textures for avatars from short monocular phone captures, using StyleGAN2 and diffusion models to recover facial detail and remove the capture-time lighting.
Contribution
The paper parameterizes phone texture maps in StyleGAN2's $W^+$ space and applies diffusion super-resolution to produce studio-like textures from casual phone videos, narrowing the quality gap with studio-captured avatars.
Findings
Produces photorealistic, uniformly lit avatars from monocular captures
Enhances facial detail accuracy with diffusion super-resolution
Achieves near-studio quality textures from casual phone videos
Abstract
Creating photorealistic avatars for individuals traditionally involves extensive capture sessions with complex and expensive studio devices like the LightStage system. While recent strides in neural representations have enabled the generation of photorealistic and animatable 3D avatars from quick phone scans, they have the capture-time lighting baked-in, lack facial details and have missing regions in areas such as the back of the ears. Thus, they lag in quality compared to studio-captured avatars. In this paper, we propose a method that bridges this gap by generating studio-like illuminated texture maps from short, monocular phone captures. We do this by parameterizing the phone texture maps using the space of a StyleGAN2, enabling near-perfect reconstruction. Then, we finetune a StyleGAN2 by sampling in the parameterized space using a very small set of studio-captured…
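The idea of parameterizing a texture in an extended latent space can be illustrated with a toy sketch. The code below is not the authors' method and does not use StyleGAN2: it assumes a hypothetical linear "generator" in which every layer adds a contribution controlled by its own style code, and it compares fitting one shared code for all layers (analogous to $W$) against fitting a separate code per layer (analogous to $W^+$). All names, sizes, and the learning rate are assumptions chosen for illustration.

```python
import random

random.seed(0)
NUM_LAYERS, LATENT_DIM, OUT_DIM = 8, 4, 8

# Toy stand-in for a generator (NOT StyleGAN2): each layer maps its own
# style code linearly into the output, and the contributions are summed.
LAYERS = [[[random.gauss(0.0, 0.25) for _ in range(LATENT_DIM)]
           for _ in range(OUT_DIM)] for _ in range(NUM_LAYERS)]


def generate(w_plus):
    """Render an output vector from one style code per layer."""
    out = [0.0] * OUT_DIM
    for matrix, w in zip(LAYERS, w_plus):
        for r in range(OUT_DIM):
            out[r] += sum(matrix[r][c] * w[c] for c in range(LATENT_DIM))
    return out


def invert(target, shared, steps=2000, lr=0.05):
    """Fit style codes to `target` by gradient descent on squared error.

    shared=True ties all layers to one code (W-like);
    shared=False lets every layer have its own code (W+-like).
    """
    w_plus = [[0.0] * LATENT_DIM for _ in range(NUM_LAYERS)]
    for _ in range(steps):
        residual = [o - t for o, t in zip(generate(w_plus), target)]
        # Per-layer gradient: transpose(matrix) @ residual.
        grads = [[sum(matrix[r][c] * residual[r] for r in range(OUT_DIM))
                  for c in range(LATENT_DIM)] for matrix in LAYERS]
        if shared:
            # A tied code receives the sum of all per-layer gradients.
            total = [sum(g[c] for g in grads) for c in range(LATENT_DIM)]
            grads = [total] * NUM_LAYERS
        for w, g in zip(w_plus, grads):
            for c in range(LATENT_DIM):
                w[c] -= lr * g[c]
    return w_plus


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Arbitrary target that was not produced by the generator.
target = [random.uniform(-1.0, 1.0) for _ in range(OUT_DIM)]
err_w = mse(generate(invert(target, shared=True)), target)
err_w_plus = mse(generate(invert(target, shared=False)), target)
print("shared code (W-like) error:    ", err_w)
print("per-layer codes (W+-like) error:", err_w_plus)
```

In this toy setting the per-layer codes have more free parameters than the output has dimensions, so they can fit an arbitrary target closely, while a single shared code cannot. This mirrors, only loosely, why an extended latent space is a common choice when near-perfect reconstruction is required.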
Taxonomy
Topics: Augmented Reality Applications · Educational Games and Gamification
Methods: Sparse Evolutionary Training · Path Length Regularization · Weight Demodulation · R1 Regularization · Convolution · StyleGAN2 · Diffusion
