SAM-guided Pseudo Label Enhancement for Multi-modal 3D Semantic Segmentation

Mingyu Yang; Jitong Lu; Hun-Seok Kim

arXiv:2502.00960 · cs.CV · February 4, 2025

Open Access

TL;DR

This paper introduces a SAM-guided pseudo-label enhancement method that leverages 2D prior knowledge to improve the quality and quantity of pseudo-labels, significantly boosting cross-domain 3D semantic segmentation performance.

Contribution

It proposes a novel image-guided pseudo-label refinement approach using SAM masks and Geometry-Aware Progressive Propagation for better domain adaptation in multi-modal 3D segmentation.

Findings

01 Significantly increases the number of high-quality pseudo-labels.

02 Improves domain adaptation performance across datasets.

03 Outperforms baseline methods in experiments.

Abstract

Multi-modal 3D semantic segmentation is vital for applications such as autonomous driving and virtual reality (VR). To effectively deploy these models in real-world scenarios, it is essential to employ cross-domain adaptation techniques that bridge the gap between training data and real-world data. Recently, self-training with pseudo-labels has emerged as a predominant method for cross-domain adaptation in multi-modal 3D semantic segmentation. However, generating reliable pseudo-labels necessitates stringent constraints, which often result in sparse pseudo-labels after pruning. This sparsity can potentially hinder performance improvement during the adaptation process. We propose an image-guided pseudo-label enhancement approach that leverages the complementary 2D prior knowledge from the Segment Anything Model (SAM) to introduce more reliable pseudo-labels, thereby boosting domain…
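The sparsity problem the abstract describes, and the general idea of using 2D masks to recover labels, can be illustrated with a minimal sketch. This is not the paper's method: the confidence threshold, the majority-vote rule, the `min_agreement` parameter, and both function names are assumptions made for illustration, and the sketch uses no geometric information. It assumes each 3D point has already been projected into the image and assigned the ID of the 2D mask it falls in (`-1` for no mask).

```python
import numpy as np

IGNORE = -1  # marks points that carry no pseudo-label


def prune_pseudo_labels(probs, threshold=0.9):
    """Keep the argmax class only where the top probability reaches the threshold.

    probs: (N, C) array of per-point class probabilities.
    A strict threshold gives reliable but sparse labels.
    """
    labels = probs.argmax(axis=1)
    labels[probs.max(axis=1) < threshold] = IGNORE
    return labels


def propagate_within_masks(labels, mask_ids, min_agreement=0.8):
    """Fill unlabeled points in a mask with that mask's majority label.

    labels:   (N,) pseudo-labels, IGNORE where pruned.
    mask_ids: (N,) ID of the 2D mask each projected point falls in, -1 for none.
    A mask is used only if its labeled points agree strongly enough
    (hypothetical rule, chosen here for simplicity).
    """
    out = labels.copy()
    for m in np.unique(mask_ids):
        if m < 0:
            continue  # point is not covered by any mask
        in_mask = mask_ids == m
        known = labels[in_mask]
        known = known[known != IGNORE]
        if known.size == 0:
            continue  # nothing to propagate from
        classes, counts = np.unique(known, return_counts=True)
        if counts.max() / known.size >= min_agreement:
            out[in_mask & (labels == IGNORE)] = classes[counts.argmax()]
    return out


if __name__ == "__main__":
    probs = np.array([[0.95, 0.05], [0.6, 0.4], [0.55, 0.45],
                      [0.02, 0.98], [0.5, 0.5], [0.7, 0.3]])
    mask_ids = np.array([0, 0, 0, 1, 1, -1])
    sparse = prune_pseudo_labels(probs)
    dense = propagate_within_masks(sparse, mask_ids)
    print(sparse, dense)
```

In this toy input, pruning leaves only two of six points labeled, and the mask-based step restores labels for the three unlabeled points that share a mask with a confident one. The point outside every mask stays unlabeled.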

Taxonomy

Topics: Advanced Neural Network Applications · Handwritten Text Recognition Techniques · Industrial Vision Systems and Defect Detection

Methods: Segment Anything Model