Envisioning a Next Generation Extended Reality Conferencing System with Efficient Photorealistic Human Rendering
Chuanyue Shen, Letian Zhang, Zhangsihao Yang, Masood Mortazavi, Xiyun Song, Liang Peng, Heather Yu

TL;DR
This paper proposes an accelerated NeRF-based pipeline for photorealistic human rendering in extended reality conferencing, significantly improving rendering speed while maintaining quality to enable immersive, multi-user metaverse meetings.
Contribution
It introduces a novel accelerated NeRF algorithm for real-time photorealistic human rendering, addressing speed limitations of existing neural rendering methods for virtual meetings.
Findings
Training speed improved by 44.5%
Inference speed increased by 213%
Maintains comparable rendering quality to state-of-the-art methods
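To make the two percentages above easier to compare, the sketch below converts a relative speed increase into a multiplicative factor and into the fraction of baseline time that remains. It assumes "speed" means throughput (for example, iterations or frames per second) and that each percentage is a relative increase over a baseline; the summary does not state either point, and the function names are hypothetical.

```python
def speedup_factor(percent_increase: float) -> float:
    """Convert a relative speed increase (in percent) to a multiplicative factor.

    Assumption: the percentage is measured relative to a baseline throughput,
    so +100% means twice as fast.
    """
    return 1.0 + percent_increase / 100.0


def remaining_time_fraction(percent_increase: float) -> float:
    """Fraction of the baseline wall-clock time needed after the speedup."""
    return 1.0 / speedup_factor(percent_increase)


# Figures taken from the Findings list; the interpretation is an assumption.
training_factor = speedup_factor(44.5)
inference_factor = speedup_factor(213.0)

print(f"training:  {training_factor:.3f}x speed, "
      f"{remaining_time_fraction(44.5):.1%} of baseline time")
print(f"inference: {inference_factor:.3f}x speed, "
      f"{remaining_time_fraction(213.0):.1%} of baseline time")
```

Under this reading, training runs at about 1.445 times the baseline speed and inference at about 3.13 times. If the percentages instead describe reductions in time, the factors would differ.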
Abstract
Meeting online is becoming the new normal. Creating an immersive experience for online meetings is a necessity towards more diverse and seamless environments. Efficient photorealistic rendering of human 3D dynamics is the core of immersive meetings. Current popular applications achieve real-time conferencing but fall short in delivering photorealistic human dynamics, either due to limited 2D space or the use of avatars that lack realistic interactions between participants. Recent advances in neural rendering, such as the Neural Radiance Field (NeRF), offer the potential for greater realism in metaverse meetings. However, the slow rendering speed of NeRF poses challenges for real-time conferencing. We envision a pipeline for a future extended reality metaverse conferencing system that leverages monocular video acquisition and free-viewpoint synthesis to enhance data and hardware…
Taxonomy
Topics: Advanced Vision and Imaging · Virtual Reality Applications and Impacts · Computer Graphics and Visualization Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
