Depth-Guided Semi-Supervised Instance Segmentation

Xin Chen; Jie Hu; Xiawu Zheng; Jianghang Lin; Liujuan Cao; Rongrong Ji

arXiv:2406.17413·cs.CV·June 26, 2024·2 cites

TL;DR

This paper introduces a depth-guided semi-supervised instance segmentation framework that uses depth maps to improve pseudo-label accuracy and model performance, outperforming previous RGB-only methods on the COCO and Cityscapes datasets.

Contribution

The paper proposes a novel depth-guided framework, with Depth Feature Fusion and a Depth Controller, to improve the accuracy of semi-supervised instance segmentation.

Findings

01. Achieves state-of-the-art results on COCO, with 22.29% mAP using 1% labeled data.

02. Outperforms previous methods on the Cityscapes dataset.

03. Demonstrates that effective integration of depth information improves segmentation performance.

Abstract

Semi-Supervised Instance Segmentation (SSIS) aims to leverage unlabeled data during training. Previous frameworks primarily used the RGB information of unlabeled images to generate pseudo-labels. However, such a mechanism often introduces unstable noise, as a single instance can display multiple RGB values. To overcome this limitation, we introduce a Depth-Guided (DG) SSIS framework. This framework uses depth maps extracted from input images, which represent individual instances with closely associated distance values, offering precise contours for distinct instances. Unlike RGB data, depth maps provide a unique perspective, making their integration into the SSIS process complex. To this end, we propose Depth Feature Fusion, which integrates features extracted from depth estimation. This integration allows the model to understand depth information better and ensure its…

Taxonomy

Topics: Industrial Vision Systems and Defect Detection · Machine Learning and Data Classification

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings