A Multi-View 3D Telepresence System for XR Robot Teleoperation
Enes Ulas Dincer, Manuel Zaremski, Alexandra Nick, Elias Wucher, Barbara Deml, Gerhard Neumann

TL;DR
This paper presents a multi-view VR telepresence system for robot teleoperation that fuses multi-camera geometry and local RGB streams, improving task performance and usability in manipulation tasks.
Contribution
The authors introduce a real-time, GPU-accelerated multi-view VR system combining point clouds and local RGB streams, enhancing telepresence for robot manipulation.
Findings
The system supports real-time rendering of ~75k points on Meta Quest 3.
Participants achieved higher task success and reported lower workload with the proposed system than with the other visualisation modalities in the study.
Combining global 3D structure with local high-resolution detail improves teleoperation performance.
Abstract
Robot teleoperation is critical for applications such as remote maintenance, fleet robotics, search and rescue, and data collection for robot learning. Effective teleoperation requires intuitive 3D visualization with reliable depth cues, which conventional screen-based interfaces often fail to provide. We introduce a multi-view VR telepresence system that (1) fuses geometry from three cameras to produce GPU-accelerated point-cloud rendering on standalone VR hardware, and (2) integrates a wrist-mounted RGB stream to provide high-resolution local detail where point-cloud accuracy is limited. Our pipeline supports real-time rendering of approximately 75k points on the Meta Quest 3. A within-subject study was conducted with 31 participants to compare our system to other visualisation modalities, such as RGB streams, a projection of stereo-vision directly in the VR device and point clouds…
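The multi-camera fusion described in (1) can be illustrated with a minimal sketch. It is not the authors' GPU pipeline: it assumes each camera already delivers a point list in its own frame together with a known camera-to-world transform, and it uses a simple voxel grid plus random subsampling to stay within a point budget. The names `to_world`, `fuse_clouds`, `voxel_size`, and `POINT_BUDGET` are hypothetical, and the code is plain Python for readability; only the figure of roughly 75k points comes from the paper.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Matrix4 = Sequence[Sequence[float]]  # row-major 4x4 camera-to-world transform

# Approximate point count reported for the Meta Quest 3; used here only as a cap.
POINT_BUDGET = 75_000


def to_world(points: Sequence[Point], cam_to_world: Matrix4) -> List[Point]:
    """Apply a rigid 4x4 transform to points expressed in a camera frame."""
    out: List[Point] = []
    for x, y, z in points:
        out.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + row[3]
            for row in cam_to_world[:3]
        ))
    return out


def fuse_clouds(
    clouds: Sequence[Sequence[Point]],
    extrinsics: Sequence[Matrix4],
    voxel_size: float = 0.01,   # assumed grid resolution in metres
    budget: int = POINT_BUDGET,
    seed: int = 0,
) -> List[Point]:
    """Merge per-camera clouds into one world-frame cloud within a point budget."""
    voxels: Dict[Tuple[int, int, int], Point] = {}
    for points, cam_to_world in zip(clouds, extrinsics):
        for p in to_world(points, cam_to_world):
            key = tuple(math.floor(c / voxel_size) for c in p)
            # Keep the first point seen in each voxel so overlapping
            # camera views do not produce duplicate geometry.
            voxels.setdefault(key, p)
    fused = list(voxels.values())
    if len(fused) > budget:
        # Uniform random subsampling; a real renderer might prioritise
        # points near the end effector instead.
        fused = random.Random(seed).sample(fused, budget)
    return fused
```

With three cameras, `fuse_clouds([cloud_a, cloud_b, cloud_c], [T_a, T_b, T_c])` would return a single world-frame list of at most `POINT_BUDGET` points. On a standalone headset the same steps would normally run in a compute shader rather than in a Python loop.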
