Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation

Srijan Dokania; Dharini Raghavan

arXiv:2512.08271 · cs.RO · December 10, 2025

Open Access

TL;DR

Zero-Splat TeleAssist presents a novel zero-shot sensor-fusion pipeline that enables real-time, fiducial-free 6-DoF pose estimation from CCTV streams for effective multilateral teleoperation.

Contribution

It introduces a new sensor-fusion framework combining vision-language segmentation, monocular depth, and 3D Gaussian Splatting for zero-shot, real-time robot pose estimation.

Findings

01. Provides real-time global robot positions and orientations

02. Operates without fiducials or depth sensors

03. Enables interaction-centric teleoperation

Abstract

We introduce Zero-Splat TeleAssist, a zero-shot sensor-fusion pipeline that transforms commodity CCTV streams into a shared, 6-DoF world model for multilateral teleoperation. By integrating vision-language segmentation, monocular depth, weighted-PCA pose extraction, and 3D Gaussian Splatting (3DGS), TeleAssist provides every operator with real-time global positions and orientations of multiple robots without fiducials or depth sensors in an interaction-centric teleoperation setup.
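The abstract names weighted-PCA pose extraction but gives no details, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes the robot has already been lifted to a set of 3D points (for example from a segmentation mask combined with monocular depth) and that each point carries a non-negative weight such as a mask confidence. The function name `weighted_pca_pose` and the synthetic point grid are hypothetical.

```python
import numpy as np


def weighted_pca_pose(points, weights):
    """Return (centroid, axes) for weighted 3D points.

    Assumption: `points` is an (N, 3) array of back-projected robot points and
    `weights` is an (N,) array of non-negative per-point confidences.
    `axes` is a 3x3 right-handed rotation matrix whose columns are the
    principal directions, ordered from largest to smallest variance.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    w = weights / weights.sum()                  # normalise weights to sum to 1

    centroid = w @ points                        # weighted mean = position estimate
    centered = points - centroid
    cov = (centered * w[:, None]).T @ centered   # 3x3 weighted covariance

    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending variance
    axes = eigvecs[:, order]

    # eigh may return a reflection; flip the last axis to get a proper rotation.
    if np.linalg.det(axes) < 0:
        axes[:, 2] *= -1.0
    return centroid, axes


# Hypothetical usage: a box-shaped point set elongated along x, centred at (1, 2, 3).
xs = np.linspace(-5.0, 5.0, 11)
ys = np.linspace(-1.0, 1.0, 5)
zs = np.array([-0.1, 0.1])
grid = np.array([[x, y, z] for x in xs for y in ys for z in zs])
demo_points = grid + np.array([1.0, 2.0, 3.0])
demo_weights = np.ones(len(demo_points))

centroid, axes = weighted_pca_pose(demo_points, demo_weights)
```

PCA fixes each axis only up to sign, so a practical system would need an extra cue (such as motion direction or a known asymmetry of the robot) to resolve which way the principal axis points; that step is outside this sketch.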

Taxonomy

Topics: Teleoperation and Haptic Systems · Robot Manipulation and Learning · Hand Gesture Recognition Systems