DepthMatch: Semi-Supervised RGB-D Scene Parsing through Depth-Guided Regularization

Jianxin Huang; Jiahang Li; Sergey Vityazev; Alexander Dvorkovich; Rui Fan

arXiv:2505.20041·cs.CV·May 27, 2025

TL;DR

DepthMatch is a semi-supervised RGB-D scene parsing framework that leverages depth-guided regularization, patch mix-up augmentation, and a lightweight spatial prior to improve boundary detection and achieve state-of-the-art results.

Contribution

The paper introduces DepthMatch, a semi-supervised learning approach for RGB-D scene parsing that exploits unlabeled data through complementary patch mix-up augmentation and a lightweight spatial prior injector that replaces complex fusion modules.

Findings

01 Achieves state-of-the-art results on the NYUv2 dataset.

02 Ranks first on the KITTI Semantics benchmark.

03 Demonstrates applicability to both indoor and outdoor scenes.

Abstract

RGB-D scene parsing methods effectively capture both semantic and geometric features of the environment, demonstrating great potential under challenging conditions such as extreme weather and low lighting. However, existing RGB-D scene parsing methods predominantly rely on supervised training strategies, which require a large amount of manually annotated pixel-level labels that are both time-consuming and costly. To overcome these limitations, we introduce DepthMatch, a semi-supervised learning framework that is specifically designed for RGB-D scene parsing. To make full use of unlabeled data, we propose complementary patch mix-up augmentation to explore the latent relationships between texture and spatial features in RGB-D image pairs. We also design a lightweight spatial prior injector to replace traditional complex fusion modules, improving the efficiency of heterogeneous feature…
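The abstract names complementary patch mix-up augmentation but does not spell out the procedure, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that an RGB image and its paired depth map are blended patch by patch with a random binary mask and its complement, yielding two mixed views of the same scene. The function names `patch_mask` and `complementary_mix`, the patch size, and the 50/50 patch selection are all hypothetical choices made for illustration.

```python
import random


def patch_mask(height, width, patch, rng):
    """Build a binary mask that is constant inside each patch x patch cell.

    Assumption: each cell is picked independently with probability 0.5.
    """
    rows = (height + patch - 1) // patch
    cols = (width + patch - 1) // patch
    cells = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
    return [[cells[y // patch][x // patch] for x in range(width)]
            for y in range(height)]


def complementary_mix(rgb, depth, mask):
    """Return two mixed images that use complementary patch selections.

    Where the mask is 1, the first output takes the RGB pixel and the second
    takes the depth pixel; where the mask is 0, the roles are swapped.
    """
    height, width = len(mask), len(mask[0])
    mixed_a = [[rgb[y][x] if mask[y][x] else depth[y][x] for x in range(width)]
               for y in range(height)]
    mixed_b = [[depth[y][x] if mask[y][x] else rgb[y][x] for x in range(width)]
               for y in range(height)]
    return mixed_a, mixed_b


# Toy 8x8 "images": each pixel is tagged with its source so the mix is visible.
rng = random.Random(0)
rgb = [[("rgb", y, x) for x in range(8)] for y in range(8)]
depth = [[("depth", y, x) for x in range(8)] for y in range(8)]
mask = patch_mask(8, 8, patch=4, rng=rng)
mixed_a, mixed_b = complementary_mix(rgb, depth, mask)
```

Under this reading, every pixel location appears once from the RGB image and once from the depth map across the two outputs, so no information from the pair is discarded. A real pipeline would operate on image tensors rather than nested lists.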

Taxonomy

Topics: Advanced Vision and Imaging · Medical Image Segmentation Techniques · Advanced Image and Video Retrieval Techniques