Language-Guided Object-Centric Diffusion Policy for Generalizable and Collision-Aware Robotic Manipulation

Hang Li; Qian Feng; Zhi Zheng; Jianxiang Feng; Zhaopeng Chen; Alois Knoll

arXiv:2407.00451·cs.RO·March 18, 2025·1 cites

TL;DR

This paper presents Lan-o3dp, a language-guided diffusion policy for robotic manipulation that generalizes to unseen scenarios, incorporates collision avoidance, and achieves high success with minimal demonstrations.

Contribution

Introduces a novel object-centric diffusion policy conditioned on 3D point clouds, integrating language understanding and cost-guided trajectory optimization for collision-aware manipulation.

Findings

1. Achieves 68.7% success rate in simulation across 21 tasks.
2. Demonstrates effective zero-shot collision avoidance in real-world tests.
3. Outperforms baselines using 2D and 3D scene representations.

Abstract

Learning from demonstrations faces challenges in generalizing beyond the training data and often lacks collision awareness. This paper introduces Lan-o3dp, a language-guided object-centric diffusion policy framework that can adapt to unseen situations such as cluttered scenes, shifting camera views, and ambiguous similar objects while offering training-free collision avoidance and achieving a high success rate with few demonstrations. We train a diffusion model conditioned on 3D point clouds of task-relevant objects to predict the robot's end-effector trajectories, enabling it to complete the tasks. During inference, we incorporate cost optimization into denoising steps to guide the generated trajectory to be collision-free. We leverage open-set segmentation to obtain the 3D point clouds of related objects. We use a large language model to identify the target objects and possible…
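The cost-guided inference step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned diffusion model is replaced by a toy stand-in (`make_toy_denoiser`), the trajectory is reduced to 3D end-effector positions, and the cost is a simple squared-hinge clearance penalty against obstacle points. All function names, the `margin` and `guidance_scale` parameters, and the cost form are assumptions made for illustration.

```python
import numpy as np


def collision_cost_grad(traj, obstacle_points, margin):
    """Gradient of 0.5 * sum(max(0, margin - dist)^2) w.r.t. the waypoints.

    traj: (T, 3) waypoints; obstacle_points: (M, 3) points of the obstacle cloud.
    The cost form is an assumption for this sketch, not taken from the paper.
    """
    diff = traj[:, None, :] - obstacle_points[None, :, :]   # (T, M, 3)
    dist = np.linalg.norm(diff, axis=-1)                     # (T, M)
    penetration = np.clip(margin - dist, 0.0, None)          # > 0 only inside the margin
    direction = diff / np.maximum(dist, 1e-9)[..., None]     # unit vectors away from obstacles
    # d(cost)/d(traj) = -penetration * direction, summed over obstacle points
    return -(penetration[..., None] * direction).sum(axis=1)


def guided_denoise(traj, denoise_step, obstacle_points,
                   n_steps=50, margin=0.1, guidance_scale=0.5):
    """Alternate a denoising update with a gradient step on the collision cost."""
    for t in reversed(range(n_steps)):
        traj = denoise_step(traj, t)                          # model update (stand-in here)
        grad = collision_cost_grad(traj, obstacle_points, margin)
        traj = traj - guidance_scale * grad                   # push waypoints out of the margin
    return traj


def make_toy_denoiser(start, goal, horizon):
    """Hypothetical stand-in for a trained denoiser: contracts toward a straight line."""
    line = np.linspace(start, goal, horizon)                  # (horizon, 3)

    def denoise_step(traj, t):
        return traj + 0.5 * (line - traj)

    return denoise_step
```

In this sketch the guidance is applied only at inference time, so the stand-in denoiser needs no retraining to respond to a new obstacle; setting `guidance_scale=0.0` recovers the unguided trajectory for comparison.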

Taxonomy

Topics: Robot Manipulation and Learning · Modular Robots and Swarm Intelligence · Reinforcement Learning in Robotics