Language-Guided Object-Centric Diffusion Policy for Generalizable and Collision-Aware Robotic Manipulation
Hang Li, Qian Feng, Zhi Zheng, Jianxiang Feng, Zhaopeng Chen, Alois Knoll

TL;DR
This paper presents Lan-o3dp, a language-guided diffusion policy for robotic manipulation that generalizes to unseen scenarios, incorporates collision avoidance, and achieves high success with minimal demonstrations.
Contribution
Introduces a novel object-centric diffusion policy conditioned on 3D point clouds, integrating language understanding and cost-guided trajectory optimization for collision-aware manipulation.
Findings
Achieves 68.7% success rate in simulation across 21 tasks.
Demonstrates effective zero-shot collision avoidance in real-world tests.
Outperforms baselines using 2D and 3D scene representations.
Abstract
Learning from demonstrations faces challenges in generalizing beyond the training data and often lacks collision awareness. This paper introduces Lan-o3dp, a language-guided object-centric diffusion policy framework that can adapt to unseen situations such as cluttered scenes, shifting camera views, and ambiguous similar objects while offering training-free collision avoidance and achieving a high success rate with few demonstrations. We train a diffusion model conditioned on 3D point clouds of task-relevant objects to predict the robot's end-effector trajectories, enabling it to complete the tasks. During inference, we incorporate cost optimization into denoising steps to guide the generated trajectory to be collision-free. We leverage open-set segmentation to obtain the 3D point clouds of related objects. We use a large language model to identify the target objects and possible…
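The cost-guided denoising idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `toy_denoiser` is a hypothetical stand-in for the trained diffusion model (it simply pulls waypoints toward a straight line), and the hinge-squared clearance cost, the `margin`, and the `guidance_scale` are assumptions chosen for illustration. The sketch only shows the general pattern of applying a cost-gradient step after each denoising step so that the sampled end-effector trajectory moves away from obstacle points.

```python
import numpy as np


def collision_cost_and_grad(traj, obstacle_pts, margin):
    """Hinge-squared clearance cost and its gradient w.r.t. the waypoints.

    traj: (T, 3) end-effector positions; obstacle_pts: (N, 3) obstacle point cloud.
    The cost form is an assumption made for this sketch.
    """
    diff = traj[:, None, :] - obstacle_pts[None, :, :]   # (T, N, 3)
    dist = np.linalg.norm(diff, axis=-1)                 # (T, N)
    penetration = np.clip(margin - dist, 0.0, None)      # > 0 only inside the margin
    cost = 0.5 * np.sum(penetration ** 2)
    # d/dx [0.5 * (margin - |x - o|)^2] = -(margin - |x - o|) * (x - o) / |x - o|
    direction = diff / np.maximum(dist, 1e-8)[..., None]
    grad = -(penetration[..., None] * direction).sum(axis=1)  # (T, 3)
    return cost, grad


def toy_denoiser(traj, start, goal, rate=0.5):
    """Hypothetical stand-in for a trained denoising network."""
    alphas = np.linspace(0.0, 1.0, len(traj))[:, None]
    line = (1.0 - alphas) * start + alphas * goal
    return traj + rate * (line - traj)


def guided_sampling(start, goal, obstacle_pts, steps=50, horizon=16,
                    guidance_scale=0.0, margin=0.15, seed=0):
    """Start from noise; alternate a denoising step with a cost-gradient step."""
    rng = np.random.default_rng(seed)
    traj = rng.normal(size=(horizon, 3))
    for _ in range(steps):
        traj = toy_denoiser(traj, start, goal)
        if guidance_scale > 0.0:
            _, grad = collision_cost_and_grad(traj, obstacle_pts, margin)
            traj = traj - guidance_scale * grad  # descend the collision cost
    return traj


def min_clearance(traj, obstacle_pts):
    """Smallest waypoint-to-obstacle distance along the trajectory."""
    diff = traj[:, None, :] - obstacle_pts[None, :, :]
    return float(np.linalg.norm(diff, axis=-1).min())


start = np.array([0.0, 0.0, 0.0])
goal = np.array([1.0, 0.0, 0.0])
obstacle = np.array([[0.5, 0.02, 0.0]])  # a single obstacle point near the path

plain = guided_sampling(start, goal, obstacle, guidance_scale=0.0)
guided = guided_sampling(start, goal, obstacle, guidance_scale=0.5)
print("clearance without guidance:", min_clearance(plain, obstacle))
print("clearance with guidance:   ", min_clearance(guided, obstacle))
```

Because the guidance acts only at sampling time, the denoiser itself is left unchanged, which is the sense in which this kind of collision avoidance is training-free. In this toy setup the guided trajectory settles where the denoiser's pull and the cost gradient balance, so it keeps more clearance than the unguided one but does not necessarily reach the full margin.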
Taxonomy
Topics: Robot Manipulation and Learning · Modular Robots and Swarm Intelligence · Reinforcement Learning in Robotics
