Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation

Jingyi Xu; Weidong Yang; Lingdong Kong; Youquan Liu; Rui Zhang; Qingyuan Zhou; Ben Fei

arXiv:2403.10001·cs.CV·March 18, 2024·2 cites

Open Access · 1 Repo

TL;DR

This paper introduces VFMSeg, a novel framework leveraging visual foundation models to improve cross-modal unsupervised domain adaptation for 3D semantic segmentation, significantly enhancing label accuracy and segmentation performance.

Contribution

The work proposes a new pipeline that uses pre-trained visual foundation models to generate more accurate pseudo labels and guide data augmentation for better domain adaptation in 3D segmentation.

Findings

01. Significant improvement in 3D segmentation accuracy on autonomous driving datasets.

02. Effective use of visual foundation models for label generation and data augmentation.

03. Enhanced cross-modal domain adaptation performance.

Abstract

Unsupervised domain adaptation (UDA) is vital for alleviating the workload of labeling 3D point cloud data and mitigating the absence of labels when facing a newly defined domain. Various methods of utilizing images to enhance the performance of cross-domain 3D segmentation have recently emerged. However, the pseudo labels, which are generated from models trained on the source domain and provide additional supervised signals for the unseen domain, are inadequate when utilized for 3D segmentation due to their inherent noisiness and consequently restrict the accuracy of neural networks. With the advent of 2D visual foundation models (VFMs) and their abundant knowledge prior, we propose a novel pipeline VFMSeg to further enhance the cross-modal unsupervised domain adaptation framework by leveraging these models. In this work, we study how to harness the knowledge priors learned by VFMs to…

Code & Models

Repositories

etrontech/vfmseg · PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · 3D Shape Modeling and Analysis