TL;DR
This paper introduces a teacher-student learning framework for video object segmentation in human-robot interaction, enabling robots to learn and segment new objects through human guidance without manual labels.
Contribution
A novel teacher-student adaptation method for robot video segmentation and a new IVOS dataset with manipulation tasks and diverse transformations.
Findings
Outperforms state-of-the-art methods on the DAVIS and FBMS datasets.
Significantly improves segmentation accuracy on the IVOS dataset.
Enables robots to learn new objects through human interaction without manual labels.
Abstract
Video object segmentation is an essential task in robot manipulation, since it facilitates grasping and learning affordances. Incremental learning is important for robotics in unstructured environments, since the total number of objects and their variations can be intractable. Inspired by how children learn, human-robot interaction (HRI) can be used to teach robots about the world under human guidance, much as children learn from a parent or a teacher. A human teacher can show potential objects of interest to the robot, which is able to self-adapt to the teaching signal without requiring manual segmentation labels. We propose a novel teacher-student learning paradigm to teach robots about their surrounding environment. A two-stream motion and appearance "teacher" network provides pseudo-labels to adapt an appearance "student" network. The student network is able to segment the…
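The pseudo-label adaptation loop described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the two-stream teacher network is replaced here by a simple average of per-pixel motion and appearance scores, and the appearance student by a one-feature logistic model. All function names, the fusion rule, the zero-centred feature values, and the hyperparameters are assumptions made for illustration only.

```python
import math


def teacher_pseudo_labels(motion_scores, appearance_scores, threshold=0.5):
    """Fuse two per-pixel score streams into binary pseudo-labels.

    Averaging is a stand-in (assumption) for the paper's two-stream teacher.
    """
    return [1 if (m + a) / 2.0 >= threshold else 0
            for m, a in zip(motion_scores, appearance_scores)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def adapt_student(features, pseudo_labels, weight=0.0, bias=0.0, lr=0.5, steps=200):
    """Fit a one-feature logistic 'student' to the teacher's pseudo-labels
    with plain batch gradient descent on the cross-entropy loss."""
    n = len(features)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(features, pseudo_labels):
            err = sigmoid(weight * x + bias) - y
            grad_w += err * x
            grad_b += err
        weight -= lr * grad_w / n
        bias -= lr * grad_b / n
    return weight, bias


def student_segment(features, weight, bias):
    """Segment from appearance alone: no motion input is needed at this stage."""
    return [1 if sigmoid(weight * x + bias) >= 0.5 else 0 for x in features]


# Toy "frame" of six pixels (all values are made up for illustration).
motion = [0.9, 0.8, 0.1, 0.2, 0.7, 0.1]
appearance = [0.8, 0.9, 0.2, 0.1, 0.6, 0.3]
features = [1.0, 0.9, -0.9, -1.0, 0.8, -0.8]  # hypothetical zero-centred appearance feature

labels = teacher_pseudo_labels(motion, appearance)  # no manual labels involved
w, b = adapt_student(features, labels)
masks = student_segment(features, w, b)
new_frame_masks = student_segment([0.7, -0.6], w, b)
```

In this sketch the teacher needs both motion and appearance cues, whereas the adapted student segments a new frame from the appearance feature alone, which mirrors the division of roles the abstract describes.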
