Learning human-to-robot handovers through 3D scene reconstruction

Yuekun Wu; Yik Lung Pang; Andrea Cavallaro; Changjae Oh

arXiv:2507.08726·cs.RO·July 14, 2025

TL;DR

This paper introduces a novel method for learning robot handover policies directly from RGB images using Gaussian Splatting reconstruction, eliminating the need for real-robot training and enabling effective real-world deployment.

Contribution

It presents the first supervised learning approach for robot handovers from RGB images without real-robot data, utilizing Gaussian Splatting for scene reconstruction and policy transfer.

Findings

01. Effective policy learned from 16 objects

02. Successful real-world handover demonstrations

03. Scene reconstruction improves policy transfer

Abstract

Learning robot manipulation policies from raw, real-world image data requires a large number of robot-action trials in the physical environment. Although training in simulation offers a cost-effective alternative, the visual domain gap between the simulation and the robot workspace remains a major limitation. Gaussian Splatting visual reconstruction methods have recently provided new directions for robot manipulation by generating realistic environments. In this paper, we propose the first method for supervised learning of robot handovers solely from RGB images, without the need for real-robot training or real-robot data collection. The proposed policy learner, Human-to-Robot Handover using Sparse-View Gaussian Splatting (H2RH-SGS), leverages sparse-view Gaussian Splatting reconstruction of human-to-robot handover scenes to generate robot demonstrations containing image-action pairs…
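To make the idea of image-action pairs concrete, the following is a minimal sketch and not the authors' implementation. It assumes a gripper-mounted camera moving along a known sequence of poses, pairs the view rendered at each pose with the relative motion to the next pose, and uses a placeholder in place of a real Gaussian Splatting renderer. The names `relative_action`, `make_image_action_pairs`, `dummy_render`, and `straight_line_poses` are hypothetical, and the action parameterisation (a 4x4 relative transform) is an assumption, since the summary above does not specify one.

```python
import numpy as np


def relative_action(T_cur, T_next):
    """Motion from the current camera pose to the next one,
    expressed in the current camera frame (assumed action format)."""
    return np.linalg.inv(T_cur) @ T_next


def make_image_action_pairs(poses, render):
    """Pair the image rendered at each pose with the motion to the next pose.

    poses  : list of 4x4 camera-to-world transforms along a trajectory
    render : callable mapping a 4x4 pose to an image (hypothetical stand-in
             for rendering the reconstructed scene from that viewpoint)
    """
    pairs = []
    for T_cur, T_next in zip(poses[:-1], poses[1:]):
        image = render(T_cur)
        action = relative_action(T_cur, T_next)
        pairs.append((image, action))
    return pairs


def dummy_render(T):
    # Placeholder only: a real pipeline would rasterise the reconstructed
    # Gaussians from pose T. Here we return a blank RGB image.
    return np.zeros((4, 4, 3), dtype=np.uint8)


def straight_line_poses(start, goal, steps):
    """Identity-rotation poses interpolated linearly from start to goal."""
    poses = []
    for s in np.linspace(0.0, 1.0, steps):
        T = np.eye(4)
        T[:3, 3] = (1.0 - s) * start + s * goal
        poses.append(T)
    return poses


# Toy trajectory: approach a point 0.4 m ahead along the camera z-axis.
poses = straight_line_poses(np.array([0.0, 0.0, 0.0]),
                            np.array([0.0, 0.0, 0.4]), steps=5)
pairs = make_image_action_pairs(poses, dummy_render)
```

Under these assumptions, each element of `pairs` is one supervised training sample: the image is the policy input and the relative transform is the regression target. Composing the actions in order from the first pose recovers the final pose, which is a quick sanity check on the labels.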
