Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
Jingyi Xu, Weidong Yang, Lingdong Kong, Youquan Liu, Rui Zhang, Qingyuan Zhou, Ben Fei

TL;DR
This paper introduces VFMSeg, a novel framework leveraging visual foundation models to improve cross-modal unsupervised domain adaptation for 3D semantic segmentation, significantly enhancing label accuracy and segmentation performance.
Contribution
The work proposes a new pipeline that uses pre-trained visual foundation models to generate more accurate pseudo labels and guide data augmentation for better domain adaptation in 3D segmentation.
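One way to picture how 2D masks from a visual foundation model could clean up noisy point-wise pseudo labels is a confidence-weighted vote inside each mask. The sketch below is an illustration under stated assumptions, not the authors' implementation: this page does not describe how VFMSeg combines VFM outputs with pseudo labels, and the function name `refine_pseudo_labels`, the `min_conf` threshold, and the voting rule are all hypothetical. It assumes each 3D point has already been projected into the image and assigned the id of the mask it falls in (`-1` if none).

```python
import numpy as np


def refine_pseudo_labels(labels, confidences, mask_ids, ignore_label=-1, min_conf=0.5):
    """Hypothetical refinement: confidence-weighted majority vote per 2D mask.

    labels:      (N,) non-negative int class id per point from a source-trained model
    confidences: (N,) float confidence of each prediction
    mask_ids:    (N,) int id of the 2D mask each projected point falls in, -1 if none
    """
    labels = np.asarray(labels)
    confidences = np.asarray(confidences, dtype=float)
    mask_ids = np.asarray(mask_ids)

    # Baseline pseudo labels: keep confident predictions, ignore the rest.
    refined = np.where(confidences >= min_conf, labels, ignore_label)
    num_classes = int(labels.max()) + 1

    for m in np.unique(mask_ids):
        if m < 0:
            continue  # point is not covered by any mask
        in_mask = mask_ids == m
        confident = in_mask & (confidences >= min_conf)
        if not confident.any():
            continue  # no reliable vote in this mask, keep baseline labels
        # Sum the confidences of each class among confident points in the mask.
        votes = np.bincount(
            labels[confident], weights=confidences[confident], minlength=num_classes
        )
        # Propagate the winning class to every point inside the mask.
        refined[in_mask] = votes.argmax()
    return refined


# Toy example: five points, two masks, one uncovered point.
demo = refine_pseudo_labels(
    labels=[0, 0, 1, 2, 2],
    confidences=[0.9, 0.8, 0.3, 0.9, 0.2],
    mask_ids=[0, 0, 0, 1, -1],
)
```

In the toy example, the low-confidence third point inherits class 0 from the two confident points that share its mask, while the uncovered low-confidence fifth point stays ignored.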
Findings
Significant improvement in 3D segmentation accuracy on autonomous driving datasets.
Effective use of visual foundation models for label generation and data augmentation.
Enhanced cross-modal domain adaptation performance.
Abstract
Unsupervised domain adaptation (UDA) is vital for alleviating the workload of labeling 3D point cloud data and mitigating the absence of labels when facing a newly defined domain. Various methods of utilizing images to enhance the performance of cross-domain 3D segmentation have recently emerged. However, the pseudo labels, which are generated from models trained on the source domain and provide additional supervised signals for the unseen domain, are inadequate when utilized for 3D segmentation due to their inherent noisiness and consequently restrict the accuracy of neural networks. With the advent of 2D visual foundation models (VFMs) and their abundant knowledge prior, we propose a novel pipeline VFMSeg to further enhance the cross-modal unsupervised domain adaptation framework by leveraging these models. In this work, we study how to harness the knowledge priors learned by VFMs to…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · 3D Shape Modeling and Analysis
