A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation

Niclas Vödisch; Kürsat Petek; Markus Käppeler; Abhinav Valada; Wolfram Burgard

arXiv:2405.19035·cs.RO·December 4, 2024

TL;DR

This paper introduces PASTEL, a label-efficient panoptic segmentation method that leverages foundation models, lightweight training, and self-training to achieve high accuracy with minimal annotations in robotic perception tasks.

Contribution

It proposes a novel fusion module and self-training scheme that significantly improve label-efficient segmentation performance using foundation model features.

Findings

01 Outperforms previous label-efficient segmentation methods

02 Effective in autonomous driving and agricultural robotics

03 Requires fewer annotations for high accuracy

Abstract

A key challenge for the widespread application of learning-based models for robotic perception is to significantly reduce the required amount of annotated training data while achieving accurate predictions. This is essential not only to decrease operating costs but also to speed up deployment time. In this work, we address this challenge for PAnoptic SegmenTation with fEw Labels (PASTEL) by exploiting the groundwork paved by visual foundation models. We leverage descriptive image features from such a model to train two lightweight network heads for semantic segmentation and object boundary detection, using very few annotated training samples. We then merge their predictions via a novel fusion module that yields panoptic maps based on normalized cut. To further enhance the performance, we utilize self-training on unlabeled images selected by a feature-driven similarity scheme. We…
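The abstract states that the fusion module builds panoptic maps based on normalized cut. As a rough illustration of that building block only, the following sketch bipartitions a small graph with the standard spectral relaxation of normalized cut. The function name and the toy affinity matrix are hypothetical; how PASTEL derives affinities from its semantic and boundary predictions is not described in this excerpt, so this is not the authors' fusion module.

```python
import numpy as np


def normalized_cut_bipartition(affinity):
    """Split graph nodes into two groups via the relaxed normalized cut.

    `affinity` is a symmetric, non-negative matrix in which every node
    has at least one positive edge (hypothetical input for this sketch).
    """
    w = np.asarray(affinity, dtype=float)
    degree = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigenvectors = np.linalg.eigh(laplacian)
    # Second-smallest eigenvector, mapped back to the generalized problem
    fiedler = d_inv_sqrt * eigenvectors[:, 1]
    # The sign of the eigenvector is arbitrary; only the grouping matters
    return fiedler > 0


# Toy graph: two tight triangles joined by one weak edge (assumed values)
affinity = np.array([
    [0.00, 1.0, 1.0, 0.01, 0.0, 0.0],
    [1.00, 0.0, 1.0, 0.00, 0.0, 0.0],
    [1.00, 1.0, 0.0, 0.00, 0.0, 0.0],
    [0.01, 0.0, 0.0, 0.00, 1.0, 1.0],
    [0.00, 0.0, 0.0, 1.00, 0.0, 1.0],
    [0.00, 0.0, 0.0, 1.00, 1.0, 0.0],
])
groups = normalized_cut_bipartition(affinity)
```

In this toy case the cut falls on the weak edge, so nodes 0 to 2 land in one group and nodes 3 to 5 in the other. In a segmentation setting the nodes would typically be pixels or regions and a split would be applied recursively, but those details are assumptions here.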

Code & Models

Repositories

robot-learning-freiburg/PASTEL (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings