Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation

Senwei Xie; Hongyu Wang; Zhanqi Xiao; Ruiping Wang; Xilin Chen

arXiv:2501.04268 · cs.RO · January 9, 2025

Open Access

TL;DR

RoboPro is a foundation model enabling zero-shot robotic manipulation by synthesizing executable code from videos and instructions, achieving state-of-the-art success rates in simulation and real-world tasks.

Contribution

The paper introduces RoboPro, a robotic foundation model that perceives visual input and follows instructions to generate policy code in zero-shot settings, with a novel Video2Code synthesis method.
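As a rough illustration of what "policy code" means here, the sketch below shows a short program that composes atomic skills to carry out an instruction. The skill names (`get_object_pose`, `grasp`, `move_to`, `release`) and the `SkillLog` class are hypothetical and are not taken from the paper's skill library. The class only records calls so the example runs without a robot.

```python
from dataclasses import dataclass, field


@dataclass
class SkillLog:
    """Stand-in for an atomic skill library (hypothetical API).

    Each method records its call instead of commanding a real robot.
    """
    calls: list = field(default_factory=list)

    def get_object_pose(self, name: str) -> tuple:
        self.calls.append(("get_object_pose", name))
        return (0.0, 0.0, 0.0)  # placeholder pose, not a real estimate

    def grasp(self, pose: tuple) -> None:
        self.calls.append(("grasp", pose))

    def move_to(self, pose: tuple) -> None:
        self.calls.append(("move_to", pose))

    def release(self) -> None:
        self.calls.append(("release",))


def policy_put_in(robot: SkillLog, obj: str, target: str) -> None:
    """What generated policy code for 'put the <obj> in the <target>' might look like."""
    obj_pose = robot.get_object_pose(obj)        # perceive the object
    robot.grasp(obj_pose)                        # pick it up
    target_pose = robot.get_object_pose(target)  # perceive the destination
    robot.move_to(target_pose)                   # carry the object over
    robot.release()                              # drop it at the target


robot = SkillLog()
policy_put_in(robot, "block", "drawer")
skill_names = [call[0] for call in robot.calls]
```

In this framing the model's output is the body of a function like `policy_put_in`. The skill library is what differs between robots and environments.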

Findings

01. RoboPro surpasses previous models in zero-shot success rate by 11.6% on RLBench.

02. RoboPro performs comparably to supervised baselines in real-world tasks.

03. The model is robust to variations in API formats and skill sets.

Abstract

Zero-shot generalization across various robots, tasks, and environments remains a significant challenge in robotic manipulation. Policy code generation methods use executable code to connect high-level task descriptions and low-level action sequences, leveraging the generalization capabilities of large language models and atomic skill libraries. In this work, we propose Robotic Programmer (RoboPro), a robotic foundation model that perceives visual information and follows free-form instructions to perform robotic manipulation with policy code in a zero-shot manner. To address the low efficiency and high cost of collecting runtime code data for robotic tasks, we devise Video2Code, which synthesizes executable code from extensive in-the-wild videos using an off-the-shelf vision-language model and a code-domain large language model. Extensive experiments show that RoboPro achieves…
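The Video2Code idea of pairing a vision-language model with a code-domain language model can be sketched as a two-stage pipeline. The abstract only names the two model types. The stage ordering, the `api_doc` argument, and all function names below are assumptions made for illustration, and both models are replaced by stubs so the example runs offline.

```python
from typing import Callable, Sequence


def video2code(
    frames: Sequence[str],
    instruction: str,
    describe_video: Callable[[Sequence[str], str], str],
    write_code: Callable[[str, str], str],
    api_doc: str,
) -> str:
    """Assumed two-stage flow: video -> textual plan -> executable code."""
    plan = describe_video(frames, instruction)  # stage 1: vision-language model
    return write_code(plan, api_doc)            # stage 2: code-domain LLM


def stub_vlm(frames: Sequence[str], instruction: str) -> str:
    # Stub standing in for a vision-language model; returns a fixed plan.
    return (
        f"Plan for '{instruction}' from {len(frames)} frames: "
        "1. grasp the cup. 2. place it on the shelf."
    )


def stub_code_llm(plan: str, api_doc: str) -> str:
    # Stub standing in for a code LLM; emits code against hypothetical APIs.
    return f"# {plan}\n# APIs: {api_doc}\ngrasp('cup')\nplace('cup', 'shelf')\n"


sample = video2code(
    frames=["f0", "f1", "f2"],
    instruction="put the cup on the shelf",
    describe_video=stub_vlm,
    write_code=stub_code_llm,
    api_doc="grasp(obj), place(obj, target)",
)
```

Passing the two models in as callables keeps the sketch independent of any particular model or serving library. A real pipeline would also need to check that the generated code actually executes, which this sketch omits.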

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Robot Manipulation and Learning