TL;DR
This paper introduces VoluMe, a real-time method for creating authentic 3D reconstructions from 2D webcam videos, enabling realistic and stable 3D video calls without complex hardware.
Contribution
It presents the first real-time 3D Gaussian reconstruction technique from a single webcam feed that maintains authenticity and stability, advancing accessible 3D videoconferencing.
Findings
Reports state-of-the-art visual quality and stability in its 3D reconstructions.
Enables live 3D video calls using only standard 2D cameras.
Demonstrates practical application in real-time 3D meetings.
Abstract
Virtual 3D meetings offer the potential to enhance copresence, increase engagement and thus improve effectiveness of remote meetings compared to standard 2D video calls. However, representing people in 3D meetings remains a challenge; existing solutions achieve high quality by using complex hardware, making use of fixed appearance via enrolment, or by inverting a pre-trained generative model. These approaches lead to constraints that are unwelcome and ill-fitting for videoconferencing applications. We present the first method to predict 3D Gaussian reconstructions in real time from a single 2D webcam feed, where the 3D representation is not only live and realistic, but also authentic to the input video. By conditioning the 3D representation on each video frame independently, our reconstruction faithfully recreates the input video from the captured viewpoint (a property we call…
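The per-frame conditioning described in the abstract can be illustrated with a toy sketch. This is not the paper's model. It assumes one Gaussian centre per pixel, a pinhole camera, and a hypothetical `predict_depth` callable standing in for whatever network the authors use. It ignores Gaussian scales, rotations, and opacities, and the "render" is a nearest-pixel scatter of the centres rather than true splatting. It shows only why a reconstruction built from a single frame can reproduce that frame exactly when viewed from the captured viewpoint.

```python
import numpy as np

def unproject_frame(frame, depth, focal, cx, cy):
    """Lift every pixel to one 3D Gaussian centre carrying that pixel's colour.

    Assumption: pixel-aligned Gaussians and a pinhole camera (illustrative only).
    """
    h, w, _ = frame.shape
    v, u = np.mgrid[0:h, 0:w].astype(np.float64)   # pixel row (v) and column (u) grids
    x = (u - cx) / focal * depth
    y = (v - cy) / focal * depth
    means = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    colors = frame.reshape(-1, 3)
    return means, colors

def project_centers(means, focal, cx, cy):
    """Project Gaussian centres back into the capturing camera."""
    u = focal * means[:, 0] / means[:, 2] + cx
    v = focal * means[:, 1] / means[:, 2] + cy
    return np.stack([u, v], axis=-1)

def reconstruct_frame(frame, predict_depth, focal, cx, cy):
    """Build a reconstruction from this frame alone; no state is carried over."""
    depth = predict_depth(frame)                   # hypothetical predictor, shape (h, w)
    return unproject_frame(frame, depth, focal, cx, cy)

def reprojection_matches_input(frame, means, colors, focal, cx, cy):
    """Scatter centre colours into the captured view and compare with the input."""
    uv = np.rint(project_centers(means, focal, cx, cy)).astype(int)
    rendered = np.zeros_like(frame)
    rendered[uv[:, 1], uv[:, 0]] = colors
    return np.array_equal(rendered, frame)
```

Because `reconstruct_frame` takes only the current frame, each reconstruction follows whatever the webcam sees at that instant. Any temporal stability therefore has to come from the predictor itself, which this sketch does not model.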
Videos
VoluMe: Authentic 3D Video Calls from Live Gaussian Splat Prediction (YouTube)
