TL;DR
This paper introduces a semi-supervised method for household object segmentation that efficiently propagates labels across multiple classes using ensemble Hopfield networks and foundation model embeddings, reducing annotation effort.
Contribution
The method scales to 50 object classes with limited annotation overhead and automatically labels 60% of the data in a RoboCup@Home setting, where preparation time is severely constrained.
Findings
Automatically labels 60% of the data in a RoboCup@Home setting.
Scales to 50 object classes with limited annotation overhead.
Uses an ensemble of Hopfield networks over complementary foundation model embedding spaces (CLIP, ViT, Theia).
Abstract
Reliable object perception is necessary for general-purpose service robots. Open-vocabulary detectors struggle to generalize beyond a few classes, and fully supervised training of object detectors requires time-intensive annotations. We present a semi-supervised label propagation approach for household object segmentation. A segment proposer generates class-agnostic masks, and an ensemble of Hopfield networks assigns labels by learning representative embeddings in complementary foundation model embedding spaces (CLIP, ViT, Theia). Our approach scales to 50 object classes with limited annotation overhead and can automatically label 60% of the data in a RoboCup@Home setting, where preparation time is severely constrained. Dataset and code are publicly available at https://github.com/ais-bonn/label_propagation.
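The label-assignment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes each segment has already been embedded in several spaces, treats a few labeled embeddings per space as stored patterns, and uses one softmax-attention retrieval step in the style of modern Hopfield networks. The function names, the `beta` and `threshold` values, the space names, and the toy 2-D embeddings are all hypothetical. Segments whose ensemble confidence stays below the threshold are left unlabeled for manual annotation.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_class_probs(query, patterns, labels, num_classes, beta=8.0):
    """One retrieval step: softmax attention of the query over stored patterns.

    The attention mass is summed per class label. `beta` is an assumed
    inverse temperature, not a value taken from the paper.
    """
    q = normalize(query)
    sims = [beta * sum(a * b for a, b in zip(q, normalize(p))) for p in patterns]
    attn = softmax(sims)
    probs = [0.0] * num_classes
    for weight, label in zip(attn, labels):
        probs[label] += weight
    return probs


def ensemble_label(queries, memories, num_classes, threshold=0.7):
    """Average class probabilities over embedding spaces.

    queries:  dict mapping space name -> query embedding
    memories: dict mapping space name -> (stored patterns, their labels)
    Returns (label, confidence); label is None when the confidence is below
    the assumed threshold, meaning the segment is left for a human.
    """
    avg = [0.0] * num_classes
    for space, query in queries.items():
        patterns, labels = memories[space]
        probs = hopfield_class_probs(query, patterns, labels, num_classes)
        avg = [a + p / len(queries) for a, p in zip(avg, probs)]
    best = max(range(num_classes), key=lambda c: avg[c])
    if avg[best] >= threshold:
        return best, avg[best]
    return None, avg[best]


# Toy memories for two hypothetical embedding spaces and two classes.
memories = {
    "space_a": ([[1.0, 0.0], [0.0, 1.0]], [0, 1]),
    "space_b": ([[1.0, 0.1], [0.1, 1.0]], [0, 1]),
}

clear_segment = {"space_a": [0.9, 0.1], "space_b": [1.0, 0.0]}
ambiguous_segment = {"space_a": [1.0, 1.0], "space_b": [1.0, 1.0]}

print(ensemble_label(clear_segment, memories, num_classes=2))
print(ensemble_label(ambiguous_segment, memories, num_classes=2))
```

In this toy setup the first segment lies close to the class 0 patterns in both spaces and receives a label. The second is equidistant from both classes, so it falls below the threshold and is returned as `None`. Averaging probabilities is only one possible way to combine the ensemble; the paper may use a different rule.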
