Self-supervised Training of Proposal-based Segmentation via Background Prediction
Isinsu Katircioglu, Helge Rhodin, Victor Constantin, Jörg Spörri, Mathieu Salzmann, Pascal Fua

TL;DR
This paper introduces a self-supervised method for object detection and segmentation in monocular images, leveraging background reconstruction to improve generalization without requiring extensive annotations.
Contribution
The paper presents a novel self-supervised training approach that links segmentation with background reconstruction, using a Monte Carlo strategy to handle proposal discreteness.
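To illustrate why discreteness calls for sampling, here is a minimal sketch of one generic Monte Carlo technique, the score-function (REINFORCE-style) gradient estimator. It is not claimed to be the paper's exact scheme. Choosing one of K proposals is a discrete step, so the expected loss cannot be differentiated through the choice itself, but its gradient with respect to the proposal scores can be estimated from samples. The logits, the per-proposal losses, and all function names below are hypothetical values chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Turn proposal scores into a categorical distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def exact_gradient(logits, losses):
    """Gradient of the expected loss sum_k p_k * L_k with respect to the logits."""
    p = softmax(logits)
    expected = sum(pk * lk for pk, lk in zip(p, losses))
    return [pk * (lk - expected) for pk, lk in zip(p, losses)]


def monte_carlo_gradient(logits, losses, num_samples, rng):
    """Score-function estimate: average of L_k * d(log p_k)/d(logits) over samples."""
    p = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(num_samples):
        # Sample one discrete proposal index from the categorical distribution.
        k = rng.choices(range(len(p)), weights=p)[0]
        for j in range(len(p)):
            indicator = 1.0 if j == k else 0.0
            grad[j] += losses[k] * (indicator - p[j])
    return [g / num_samples for g in grad]


# Hypothetical scores and losses for four candidate proposals.
logits = [0.0, 1.0, -0.5, 0.3]
losses = [0.9, 0.1, 0.7, 0.4]
rng = random.Random(0)

exact = exact_gradient(logits, losses)
estimate = monte_carlo_gradient(logits, losses, 20000, rng)
```

With enough samples, `estimate` approaches `exact`. In a real setting the exact sum is unavailable because the proposal space is too large to enumerate, which is where a sampled estimate becomes useful.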
Findings
Achieves accurate detection and segmentation in diverse, unseen images.
Outperforms existing self-supervised methods on benchmark tasks.
Approaches the performance of weakly supervised methods with less annotation.
Abstract
While supervised object detection methods achieve impressive accuracy, they generalize poorly to images whose appearance significantly differs from the data they have been trained on. To address this in scenarios where annotating data is prohibitively expensive, we introduce a self-supervised approach to object detection and segmentation, able to work with monocular images captured with a moving camera. At the heart of our approach lies the observation that segmentation and background reconstruction are linked tasks, and the idea that, because we observe a structured scene, background regions can be re-synthesized from their surroundings, whereas regions depicting the object cannot. We therefore encode this intuition as a self-supervised loss function that we exploit to train a proposal-based segmentation network. To account for the discrete nature of object proposals, we develop a…
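The intuition that background can be re-synthesized from its surroundings while the object cannot is easy to check on a toy example. The sketch below is an illustration under strong assumptions, not the paper's network or loss: the "scene" is a synthetic horizontal gradient with one bright square, and the "inpainter" is plain linear interpolation across the masked box. The function names `make_scene` and `inpaint_error` are hypothetical.

```python
def make_scene(size=32, obj=(12, 12, 8, 8)):
    """Smooth horizontal-gradient background with one bright square object."""
    img = [[x / (size - 1) for x in range(size)] for _ in range(size)]
    top, left, h, w = obj
    for y in range(top, top + h):
        for x in range(left, left + w):
            img[y][x] = 1.5  # object intensity that the background never takes
    return img


def inpaint_error(img, box):
    """Mean absolute error when the box is re-synthesized from its surroundings.

    Each row inside the box is filled by linear interpolation between the
    pixels immediately left and right of the box (a toy stand-in for a
    learned inpainting network). The box must not touch the image border.
    """
    top, left, h, w = box
    total = 0.0
    for y in range(top, top + h):
        a = img[y][left - 1]   # pixel just left of the box
        b = img[y][left + w]   # pixel just right of the box
        for i in range(w):
            t = (i + 1) / (w + 1)
            pred = (1 - t) * a + t * b
            total += abs(pred - img[y][left + i])
    return total / (h * w)


scene = make_scene()
err_object = inpaint_error(scene, (12, 12, 8, 8))      # box covering the object
err_background = inpaint_error(scene, (2, 2, 8, 8))    # box on pure background
```

Here the background box is reconstructed almost perfectly while the object box is not, so the reconstruction error acts as a label-free signal for where the object is. This is the kind of signal a self-supervised loss can exploit.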
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
