HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets
Yili Jin, Xize Duan, Fangxin Wang, Xue Liu

TL;DR
HeadsetOff enables photorealistic video conferencing on affordable VR headsets by using voice and motion cues for face reconstruction, balancing quality and latency.
Contribution
The paper introduces HeadsetOff, a system that reconstructs realistic user faces in VR conferencing using multimodal prediction and adaptive control, addressing the accessibility limitations of existing solutions.
Findings
Achieves high-quality, low-latency video conferencing
Effectively predicts user behavior for face animation
Balances video quality and delay dynamically
Abstract
Virtual Reality (VR) has become increasingly popular for remote collaboration, but video conferencing poses challenges when the user's face is covered by the headset. Existing solutions have limitations in terms of accessibility. In this paper, we propose HeadsetOff, a novel system that achieves photorealistic video conferencing on economical VR headsets by leveraging voice-driven face reconstruction. HeadsetOff consists of three main components: a multimodal predictor, a generator, and an adaptive controller. The predictor effectively predicts user future behavior based on different modalities. The generator employs voice, head motion, and eye blink to animate the human face. The adaptive controller dynamically selects the appropriate generator model based on the trade-off between video quality and delay. Experimental results demonstrate the effectiveness of HeadsetOff in achieving…
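The abstract says the adaptive controller picks a generator model by trading video quality against delay, but it does not say how. The following is a minimal sketch of one plausible policy, not the authors' method: choose the highest-quality variant whose estimated delay fits a budget, and fall back to the fastest variant otherwise. The names `GeneratorProfile` and `select_generator`, the three variants, and all quality and latency numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GeneratorProfile:
    """Hypothetical description of one generator variant (not from the paper)."""
    name: str
    quality: float     # assumed quality score, higher is better
    latency_ms: float  # assumed per-frame generation delay in milliseconds


def select_generator(profiles: Sequence[GeneratorProfile],
                     delay_budget_ms: float) -> GeneratorProfile:
    """Return the highest-quality variant that fits the delay budget.

    If no variant fits, return the fastest one so the call keeps running.
    """
    if not profiles:
        raise ValueError("at least one generator profile is required")
    feasible = [p for p in profiles if p.latency_ms <= delay_budget_ms]
    if feasible:
        return max(feasible, key=lambda p: p.quality)
    return min(profiles, key=lambda p: p.latency_ms)


# Illustrative values only; the paper's abstract reports no such numbers.
PROFILES = [
    GeneratorProfile("small", quality=0.70, latency_ms=15.0),
    GeneratorProfile("medium", quality=0.82, latency_ms=30.0),
    GeneratorProfile("large", quality=0.90, latency_ms=60.0),
]
```

With these assumed profiles, `select_generator(PROFILES, 40.0)` picks the `medium` variant, and a budget below 15 ms falls back to `small`. A real controller would re-evaluate the budget as network and device conditions change.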
Taxonomy
Topics: Virtual Reality Applications and Impacts · Augmented Reality Applications
