TL;DR
Pixel Codec Avatars (PiCA) is a deep generative model of 3D human faces that renders high-quality avatars in real time for virtual reality while staying computationally efficient, adapting to rendering conditions, and supporting multi-person scenes.
Contribution
PiCA introduces a novel fully convolutional, rendering-adaptive per-pixel decoder architecture for 3D face modeling, enabling efficient, high-quality, multi-person telecommunication in VR.
Findings
Achieves state-of-the-art face reconstruction accuracy.
Runs in real time on a mobile VR headset.
Supports rendering of five avatars simultaneously.
Abstract
Telecommunication with photorealistic avatars in virtual or augmented reality is a promising path for achieving authentic face-to-face communication in 3D over remote physical distances. In this work, we present the Pixel Codec Avatars (PiCA): a deep generative model of 3D human faces that achieves state-of-the-art reconstruction performance while being computationally efficient and adaptive to the rendering conditions during execution. Our model combines two core ideas: (1) a fully convolutional architecture for decoding spatially varying features, and (2) a rendering-adaptive per-pixel decoder. Both techniques are integrated via a dense surface representation that is learned in a weakly-supervised manner from low-topology mesh tracking over training images. We demonstrate that PiCA improves reconstruction over existing techniques across testing expressions and views on persons of…
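To make the second core idea more concrete, here is a minimal NumPy sketch of what a rendering-adaptive per-pixel decoder could look like. It is an illustration under stated assumptions, not the paper's implementation: the function name `per_pixel_decode`, the two-layer MLP, the nearest-neighbour feature lookup, the sigmoid output, and all shapes are hypothetical choices made for brevity. The only point it tries to show is that color is decoded solely for pixels the face actually covers, so the cost follows the rendered footprint rather than a fixed texture size.

```python
import numpy as np


def per_pixel_decode(feature_map, uv, mask, w1, b1, w2, b2):
    """Decode RGB only for pixels covered by the rasterized face (hypothetical sketch).

    feature_map: (Hf, Wf, C) spatially varying features, e.g. from a conv decoder
    uv:          (H, W, 2) per-pixel surface coordinates in [0, 1]
    mask:        (H, W) bool, True where the face covers the pixel
    w1, b1, w2, b2: weights of a small two-layer MLP (assumed architecture)
    """
    hf, wf, _ = feature_map.shape
    vis_uv = uv[mask]  # (N, 2): coordinates of visible pixels only

    # Nearest-neighbour lookup of the feature at each visible pixel
    # (an assumption chosen to keep the sketch short).
    cols = np.clip(np.rint(vis_uv[:, 0] * (wf - 1)).astype(int), 0, wf - 1)
    rows = np.clip(np.rint(vis_uv[:, 1] * (hf - 1)).astype(int), 0, hf - 1)
    feats = feature_map[rows, cols]  # (N, C)

    # Small shared MLP applied independently to every visible pixel.
    x = np.concatenate([feats, vis_uv], axis=1)  # (N, C + 2)
    hidden = np.maximum(x @ w1 + b1, 0.0)        # ReLU
    rgb = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))  # sigmoid, values in (0, 1)

    image = np.zeros(mask.shape + (3,))
    image[mask] = rgb
    return image, int(mask.sum())


# Usage with random stand-in data (all values are placeholders).
rng = np.random.default_rng(0)
C, hidden_dim = 4, 8
feature_map = rng.normal(size=(16, 16, C))
uv = rng.uniform(size=(32, 32, 2))
mask = np.zeros((32, 32), dtype=bool)
mask[8:24, 8:24] = True  # pretend the face covers the central region
w1, b1 = rng.normal(size=(C + 2, hidden_dim)), np.zeros(hidden_dim)
w2, b2 = rng.normal(size=(hidden_dim, 3)), np.zeros(3)

image, n_decoded = per_pixel_decode(feature_map, uv, mask, w1, b1, w2, b2)
print(image.shape, n_decoded)
```

In this sketch, shrinking `mask` (for example, an avatar seen from farther away) directly reduces the number of MLP evaluations, which is the intuition behind decoding per pixel rather than per texel.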
