Self-Supervised Unseen Object Instance Segmentation via Long-Term Robot Interaction
Yangxiao Lu, Ninad Khargonkar, Zesheng Xu, Charles Averill, Kamalesh Palanisamy, Kaiyu Hang, Yunhui Guo, Nicholas Ruozzi, Yu Xiang

TL;DR
This paper presents a robotic system that improves unseen object segmentation by leveraging long-term interactions and self-supervised learning, significantly enhancing segmentation accuracy and grasping performance in real-world scenarios.
Contribution
The system defers segmentation decisions over multiple actions and uses multi-object tracking and video segmentation for self-supervised learning, a novel approach in robotic perception.
Findings
Improved segmentation accuracy after fine-tuning with real-world data.
Enhanced grasping success on unseen objects after system training.
Effective cross-domain generalization of segmentation networks.
Abstract
We introduce a novel robotic system for improving unseen object instance segmentation in the real world by leveraging long-term robot interaction with objects. Previous approaches either grasp or push an object and then obtain the segmentation mask of the grasped or pushed object after one action. Instead, our system defers the decision on segmenting objects until after a sequence of robot pushing actions. By applying multi-object tracking and video object segmentation on the images collected via robot pushing, our system can generate segmentation masks of all the objects in these images in a self-supervised way. These include images where objects are very close to each other, and segmentation errors usually occur on these images for existing object segmentation networks. We demonstrate the usefulness of our system by fine-tuning segmentation networks trained on synthetic data with real-world…
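To make the tracking idea concrete, here is a minimal sketch of how instance masks from two consecutive frames of a pushing sequence could be associated with each other. It is not the paper's pipeline: the greedy IoU matching, the function names `mask_iou` and `associate_masks`, and the `iou_threshold` value are all assumptions chosen for illustration, standing in for the multi-object tracking and video object segmentation components the abstract mentions.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection-over-union of two boolean masks with the same shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union > 0 else 0.0


def associate_masks(prev_masks, curr_masks, iou_threshold=0.3):
    """Greedily match masks across two frames by descending IoU.

    Hypothetical helper: returns sorted (prev_index, curr_index) pairs.
    Masks whose best overlap falls below iou_threshold stay unmatched.
    """
    scores = [
        (mask_iou(p, c), i, j)
        for i, p in enumerate(prev_masks)
        for j, c in enumerate(curr_masks)
    ]
    scores.sort(reverse=True)  # highest overlap first

    used_prev, used_curr, pairs = set(), set(), []
    for score, i, j in scores:
        if score < iou_threshold:
            break  # all remaining candidates overlap even less
        if i in used_prev or j in used_curr:
            continue  # each mask is matched at most once
        pairs.append((i, j))
        used_prev.add(i)
        used_curr.add(j)
    return sorted(pairs)


# Toy example: two square objects, each shifted by one pixel after a push.
def square(rows, cols, size=8):
    m = np.zeros((size, size), dtype=bool)
    m[rows[0]:rows[1], cols[0]:cols[1]] = True
    return m


a, b = square((0, 4), (0, 4)), square((4, 8), (4, 8))
a_shift, b_shift = square((0, 4), (1, 5)), square((4, 8), (3, 7))

# The current frame lists the objects in the opposite order.
pairs = associate_masks([a, b], [b_shift, a_shift])
print(pairs)
```

Chaining such associations over a whole pushing sequence would give each object a consistent identity, so that masks from a late frame in which objects are well separated could, in principle, be carried back to earlier frames in which they touch. A real system would need something more robust than frame-to-frame IoU, since a push can move an object far enough that its masks no longer overlap.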
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
