LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field
Huan Wang, Feitong Tan, Ziqian Bai, Yinda Zhang, Shichen Liu, Qiangeng Xu, Menglei Chai, Anish Prabhu, Rohit Pandey, Sean Fanello, Zeng Huang, Yun Fu

TL;DR
LightAvatar introduces a neural light field-based head avatar model that achieves real-time rendering speeds and high image quality from monocular videos, suitable for resource-constrained devices.
Contribution
The paper presents the first neural light field head avatar model that enables fast, high-quality rendering without mesh or volume rendering, using a novel network design and distillation training strategy.
Findings
Achieves 174.1 FPS rendering speed on a consumer GPU.
Outperforms previous methods in image quality both quantitatively and qualitatively.
Operates efficiently without customized optimization.
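To put the reported frame rate in perspective, the short sketch below converts it to a per-frame time budget. The helper name `frame_time_ms` is illustrative only and is not taken from the paper or its code.

```python
def frame_time_ms(fps: float) -> float:
    """Convert a frames-per-second throughput into milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


reported_fps = 174.1  # figure quoted in the findings above
budget = frame_time_ms(reported_fps)
print(f"{budget:.2f} ms per frame")  # prints "5.74 ms per frame"
```

At this rate each frame takes under 6 ms, compared with the roughly 33 ms available per frame at a 30 FPS display rate.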
Abstract
Recent works have shown that neural radiance fields (NeRFs) on top of parametric models have reached SOTA quality to build photorealistic head avatars from a monocular video. However, one major limitation of the NeRF-based avatars is the slow rendering speed due to the dense point sampling of NeRF, preventing them from broader utility on resource-constrained devices. We introduce LightAvatar, the first head avatar model based on neural light fields (NeLFs). LightAvatar renders an image from 3DMM parameters and a camera pose via a single network forward pass, without using mesh or volume rendering. The proposed approach, while being conceptually appealing, poses significant challenges for real-time efficiency and training stability. To resolve them, we introduce dedicated network designs to obtain proper representations for the NeLF model and maintain a low FLOPs budget. Meanwhile,…
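The idea of rendering in a single forward pass, with no per-ray point sampling, can be illustrated with a toy sketch. This is not the paper's architecture: the two-layer MLP, the 64-dimensional 3DMM vector, the flattened 3x4 camera pose, the 16x16 output size, and the names `init_mlp` and `render` are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def init_mlp(sizes, seed=0):
    """Create random (weight, bias) pairs for a small fully connected net."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]


def render(params_3dmm, cam_pose, layers, height, width):
    """Map (3DMM parameters, camera pose) to an RGB image in one pass."""
    # Condition the network on both inputs by simple concatenation.
    x = np.concatenate([params_3dmm, cam_pose.reshape(-1)])
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    rgb = 1.0 / (1.0 + np.exp(-x))  # sigmoid keeps colors in [0, 1]
    return rgb.reshape(height, width, 3)


H, W = 16, 16   # hypothetical output resolution
N_3DMM = 64     # hypothetical 3DMM parameter count
layers = init_mlp([N_3DMM + 12, 128, H * W * 3])

# Neutral face parameters and an identity rotation with zero translation.
image = render(np.zeros(N_3DMM), np.eye(4)[:3], layers, H, W)
print(image.shape)  # (16, 16, 3)
```

The point of the sketch is structural: the whole image comes out of one network evaluation, so the cost is fixed by the network size rather than by the number of samples along each ray.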
Taxonomy
Topics: Advanced Vision and Imaging · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
