TL;DR
This paper introduces a self-supervised approach for an active agent to learn object segmentation through interaction, generalizing to new objects and backgrounds, and demonstrates its utility in a robotic rearrangement task.
Contribution
It presents a self-supervised method for learning instance segmentation through interaction, a robust set loss for handling the noisy training signal, and a dataset of robot interactions with a few human-labeled examples.
Findings
The model generalizes to novel objects and backgrounds.
The model is learned from over 50K self-supervised interactions.
The learned segmentation model is tested on a downstream vision-based control task: rearranging multiple objects into target configurations from visual inputs alone.
Abstract
We present an approach for building an active agent that learns to segment its visual observations into individual objects by interacting with its environment in a completely self-supervised manner. The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels. The model learned from over 50K interactions generalizes to novel objects and backgrounds. To deal with the noisy training signal for segmenting objects obtained by self-supervised interactions, we propose a robust set loss. A dataset of the robot's interactions, along with a few human-labeled examples, is provided as a benchmark for future research. We test the utility of the learned segmentation model by providing results on a downstream vision-based control task of rearranging multiple objects into target configurations from visual inputs alone.…
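The abstract does not spell out the robust set loss, so the following is only a rough illustration of the general idea of tolerating noisy mask labels, not the paper's formulation. It assumes binary masks stored as nested lists and a hypothetical overlap threshold `min_iou`: a prediction is penalized only when its intersection-over-union with the noisy target falls below that threshold.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                inter += 1
            if a or b:
                union += 1
    # Two empty masks are treated as a perfect match (an assumption).
    return inter / union if union else 1.0


def tolerant_mask_penalty(pred, noisy_target, min_iou=0.7):
    """Hypothetical penalty: zero once overlap with the noisy target
    reaches min_iou, otherwise the shortfall below min_iou."""
    return max(0.0, min_iou - iou(pred, noisy_target))
```

With a looser threshold the same prediction incurs no penalty even though it disagrees with the noisy target on some pixels. That is the intended effect when the target masks come from imperfect self-supervised interactions.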
