CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models

Haoxu Huang; Fanqi Lin; Yingdong Hu; Shengjie Wang; Yang Gao

arXiv:2403.08248 · cs.RO · March 14, 2024 · 1 citation

Open Access

TL;DR

CoPa leverages foundation models to generate spatial constraints and end-effector poses for robotic manipulation, enabling general, open-world tasks without extensive task-specific training.

Contribution

Introduces a framework that uses foundation vision-language models for both task-oriented grasping and task-aware motion planning, reducing the need for task-specific data collection and training.

Findings

01

Effective in real-world experiments with minimal prompt engineering

02

Handles open-set instructions and diverse objects

03

Integrates with existing planning algorithms for complex tasks

Abstract

Foundation models pre-trained on web-scale data are shown to encapsulate extensive world knowledge beneficial for robotic manipulation in the form of task planning. However, the actual physical implementation of these plans often relies on task-specific learning methods, which require significant data collection and struggle with generalizability. In this work, we introduce Robotic Manipulation through Spatial Constraints of Parts (CoPa), a novel framework that leverages the common sense knowledge embedded within foundation models to generate a sequence of 6-DoF end-effector poses for open-world robotic manipulation. Specifically, we decompose the manipulation process into two phases: task-oriented grasping and task-aware motion planning. In the task-oriented grasping phase, we employ foundation vision-language models (VLMs) to select the object's grasping part through a novel…
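The two-phase decomposition described in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `Pose6DoF`, `plan_manipulation`, `stub_select_grasp`, and `stub_plan_motion` are hypothetical, the roll/pitch/yaw parameterization is one of several possible 6-DoF encodings, and the stubs return fixed values where CoPa would query foundation models.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Pose6DoF:
    """End-effector pose: 3 translational and 3 rotational degrees of freedom.

    Roll/pitch/yaw in radians is an assumed encoding, chosen for readability.
    """
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


# Hypothetical phase interfaces; the paper's actual interfaces are not given here.
GraspSelector = Callable[[str], Pose6DoF]
MotionPlanner = Callable[[str, Pose6DoF], List[Pose6DoF]]


def plan_manipulation(
    instruction: str,
    select_grasp: GraspSelector,
    plan_motion: MotionPlanner,
) -> List[Pose6DoF]:
    """Chain the two phases into one sequence of end-effector poses."""
    # Phase 1: task-oriented grasping yields the grasp pose.
    grasp = select_grasp(instruction)
    # Phase 2: task-aware motion planning yields the post-grasp poses.
    post_grasp = plan_motion(instruction, grasp)
    return [grasp, *post_grasp]


def stub_select_grasp(instruction: str) -> Pose6DoF:
    """Placeholder: returns a fixed top-down grasp instead of querying a VLM."""
    return Pose6DoF(x=0.4, y=0.0, z=0.1, roll=0.0, pitch=3.14, yaw=0.0)


def stub_plan_motion(instruction: str, grasp: Pose6DoF) -> List[Pose6DoF]:
    """Placeholder: lifts the end-effector 0.2 m straight up after grasping."""
    return [replace(grasp, z=grasp.z + 0.2)]


poses = plan_manipulation("pick up the hammer", stub_select_grasp, stub_plan_motion)
```

Passing the two phases in as callables keeps the sketch neutral about how each phase is realized; any grasp selector or planner with a matching signature could be substituted for the stubs.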

Taxonomy

Topics: Modular Robots and Swarm Intelligence · Manufacturing Process and Optimization · Robot Manipulation and Learning