TL;DR
This paper presents SCIM, a framework that lets a robot autonomously discover novel semantic classes and improve its semantic segmentation in unknown environments by combining mapping, clustering, and self-supervised learning over multiple observation modalities.
Contribution
The paper introduces a novel framework for simultaneous clustering, inference, and mapping that allows robots to learn new semantic classes during deployment without supervision.
Findings
Fusion of multiple observation modalities enhances novel object discovery.
Clustering parameters can be optimized dynamically during deployment.
The approach improves semantic segmentation accuracy in open-world environments.
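To illustrate what tuning clustering parameters during deployment can look like, here is a minimal sketch. It is not the paper's implementation. It assumes k-means as the clustering method, the number of clusters as the only tuned parameter, and a score that measures how well clusters line up with classes the model already knows. The function names (`kmeans`, `known_class_score`, `select_k`) and the toy data are hypothetical.

```python
import numpy as np

def farthest_point_init(x, k):
    """Deterministic init: start at x[0], then repeatedly add the point farthest from all chosen centers."""
    centers = [x[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    return np.array(centers)

def kmeans(x, k, iters=20):
    """Plain Lloyd's k-means (assumed stand-in clusterer); returns one cluster id per row of x."""
    centers = farthest_point_init(x, k)
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):  # leave a center in place if its cluster is empty
                centers[j] = x[assign == j].mean(axis=0)
    return assign

def known_class_score(assign, labels):
    """Mean over known classes of the best IoU with any cluster; labels < 0 mark points with no known class."""
    ious = []
    for c in np.unique(labels[labels >= 0]):
        best = 0.0
        for j in np.unique(assign):
            inter = np.sum((labels == c) & (assign == j))
            union = np.sum((labels == c) | (assign == j))
            best = max(best, inter / union)
        ious.append(best)
    return float(np.mean(ious))

def select_k(x, labels, candidates):
    """Score every candidate k on the known classes only and return the best one."""
    scores = {k: known_class_score(kmeans(x, k), labels) for k in candidates}
    best_k = max(candidates, key=lambda k: scores[k])  # the first maximum wins ties
    return best_k, scores

# Toy data (hypothetical): two known classes and one novel blob labelled -1.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
x = np.concatenate([c + 0.3 * rng.standard_normal((40, 2)) for c in blob_centers])
labels = np.repeat([0, 1, -1], 40)

best_k, scores = select_k(x, labels, candidates=[2, 3, 4, 5])
print(best_k, scores)
```

On this toy data the known classes alone are enough to rule out too few clusters, and the sketch selects three. A score based only on known classes cannot detect over-segmentation of the novel region, so ties are broken toward the smaller candidate here.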
Abstract
In order to operate in human environments, a robot's semantic perception has to overcome open-world challenges such as novel objects and domain gaps. Autonomous deployment to such environments therefore requires robots to update their knowledge and learn without supervision. We investigate how a robot can autonomously discover novel semantic classes and improve accuracy on known classes when exploring an unknown environment. To this end, we develop a general framework for mapping and clustering that we then use to generate a self-supervised learning signal to update a semantic segmentation model. In particular, we show how clustering parameters can be optimized during deployment and that fusion of multiple observation modalities improves novel object discovery compared to prior work. Models, data, and implementations can be found at https://github.com/hermannsblum/scim
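The following sketch shows one way the two ideas in the abstract could fit together: fusing several observation modalities into one feature vector per point, and turning cluster assignments into a self-supervised training target. It is an illustration under stated assumptions, not the released implementation. The fusion rule (per-modality L2 normalisation, weighting, concatenation), the function names `fuse_features` and `make_pseudo_labels`, and all example values are hypothetical.

```python
import numpy as np

def fuse_features(modalities, weights):
    """L2-normalise each modality per row, scale it by its weight, and concatenate (assumed fusion rule)."""
    parts = []
    for feats, w in zip(modalities, weights):
        norms = np.linalg.norm(feats, axis=1, keepdims=True)
        parts.append(w * feats / np.maximum(norms, 1e-12))  # guard against zero rows
    return np.concatenate(parts, axis=1)

def make_pseudo_labels(pred_known, cluster_ids, novel_clusters, num_known):
    """Keep known-class predictions, but give points in novel clusters new class ids after the known ones."""
    labels = pred_known.copy()
    for new_id, c in enumerate(sorted(novel_clusters)):
        labels[cluster_ids == c] = num_known + new_id
    return labels

# Two hypothetical modalities for two points, e.g. image features and geometric features.
image_feats = np.array([[3.0, 4.0], [0.0, 2.0]])
geometric_feats = np.array([[1.0, 0.0], [0.0, 5.0]])
fused = fuse_features([image_feats, geometric_feats], weights=[1.0, 0.5])

# Hypothetical predictions and cluster ids for four points; cluster 2 is treated as novel.
pred_known = np.array([0, 1, 1, 0])
cluster_ids = np.array([0, 1, 2, 2])
pseudo = make_pseudo_labels(pred_known, cluster_ids, novel_clusters={2}, num_known=2)
print(fused.shape, pseudo)
```

The resulting `pseudo` array could serve as the target when fine-tuning a segmentation model, with the fused features being what the clustering step consumes. How clusters are judged novel is left open here and passed in as `novel_clusters`.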
