CollaBot: Vision-Language Guided Simultaneous Collaborative Manipulation
Kun Song, Shentao Ma, Gaoming Chen, Ninglong Jin, Guangbao Zhao, Mingyu Ding, Zhenhua Xiong, Jia Pan

TL;DR
CollaBot is a versatile framework enabling multiple robots to collaboratively manipulate large objects by integrating scene segmentation, grasp planning, and collision-free trajectory generation, demonstrating effectiveness across various scenarios.
Contribution
This work introduces CollaBot, a generalist framework for multi-robot collaborative manipulation that scales to different robot sizes and generalizes across task types.
Findings
52% success rate across various scenarios
Effective scene segmentation and grasp planning
Collision-free trajectory generation demonstrated
Abstract
A central research topic in robotics is how to use robotic systems to interact with the physical world. Traditional manipulation tasks primarily focus on small objects. However, in factory or home environments, large objects such as tables often need to be moved. These tasks typically require multiple robots to work collaboratively. Previous research lacks a framework that can scale to arbitrary sizes of robots and generalize to various kinds of tasks. In this work, we propose CollaBot, a generalist framework for simultaneous collaborative manipulation. First, we use SEEM for scene segmentation and point cloud extraction of the target object. Then, we propose a collaborative grasping framework, which decomposes the task into local grasp pose generation and global collaboration. Finally, we design a two-stage planning module that can generate collision-free…
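
The point cloud extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how CollaBot implements this step, so the following is not the authors' code: it assumes a depth image aligned with a binary segmentation mask (such as one a segmentation model might produce) and a standard pinhole camera model. The function name `mask_to_point_cloud` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def mask_to_point_cloud(
    depth: Sequence[Sequence[float]],
    mask: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project masked depth pixels into 3D camera-frame points.

    Assumptions (not taken from the paper): `depth` holds metric depth
    per pixel, `mask` is nonzero on the target object, and the camera
    follows the pinhole model with focal lengths (fx, fy) and
    principal point (cx, cy), all in pixels.
    """
    points: List[Point] = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, keep) in enumerate(zip(depth_row, mask_row)):
            # Skip pixels outside the object or with invalid depth.
            if not keep or z <= 0.0:
                continue
            # Pinhole back-projection: pixel (u, v) at depth z.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, float(z)))
    return points


if __name__ == "__main__":
    # Tiny 2x2 example with made-up values.
    depth_image = [[2.0, 2.0], [0.0, 2.0]]
    object_mask = [[1, 0], [1, 1]]
    cloud = mask_to_point_cloud(depth_image, object_mask, 1.0, 1.0, 0.0, 0.0)
    print(cloud)
```

In this toy example, the pixel with zero depth is dropped even though the mask selects it, since a zero reading is treated as invalid. A real pipeline would typically vectorize this with an array library and transform the points from the camera frame into a shared world frame before grasp planning.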
