Attention based Occlusion Removal for Hybrid Telepresence Systems
Surabhi Gupta, Ashwath Shetty, Avinash Sharma

TL;DR
This paper introduces an attention-based encoder-decoder model that removes HMD occlusions in VR telepresence. The person-specific model is trained on short (1-2 minute) videos of the user and reconstructs the occluded face realistically, improving immersive communication.
Contribution
It presents a novel attention-enabled encoder-decoder architecture for HMD de-occlusion, together with a person-specific training scheme that uses short videos of the user and generalizes to unseen poses and appearances of that user.
Findings
Reports superior qualitative and quantitative results over state-of-the-art occlusion removal methods.
Achieves high-quality facial reconstruction after training on only 1-2 minutes of user video.
Demonstrates applicability to hybrid teleconferencing with existing pipelines.
Abstract
Traditionally, video conferencing is a widely adopted solution for telecommunication, but it inherently lacks immersiveness because of the 2D nature of facial representation. The integration of Virtual Reality (VR) in a communication/telepresence system through Head Mounted Displays (HMDs) promises to provide users a much more immersive experience. However, HMDs hinder communication by blocking the facial appearance and expressions of the user. To overcome these issues, we propose a novel attention-enabled encoder-decoder architecture for HMD de-occlusion. We also propose to train our person-specific model using short videos (1-2 minutes) of the user, captured in varying appearances, and demonstrate generalization to unseen poses and appearances of the user. We report superior qualitative and quantitative results over state-of-the-art methods. We also present applications of this…
Taxonomy
Topics: Face recognition and analysis · Facial Nerve Paralysis Treatment and Research · Generative Adversarial Networks and Image Synthesis
