BEV-DG: Cross-Modal Learning under Bird's-Eye View for Domain Generalization of 3D Semantic Segmentation

Miaoyu Li; Yachao Zhang; Xu MA; Yanyun Qu; Yun Fu

arXiv:2308.06530 · cs.CV · August 15, 2023

Open Access

TL;DR

This paper introduces BEV-DG, a cross-modal learning approach under bird's-eye view for domain generalization of 3D semantic segmentation. It addresses the domain gap without any access to the target domain during training.

Contribution

The paper proposes BEV-based area-to-area fusion and BEV-driven domain contrastive learning to improve domain generalization in 3D segmentation tasks.
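
To make the area-level idea concrete, the following is a minimal sketch of pooling per-point features into bird's-eye-view grid cells. It is a generic illustration, not the authors' implementation: the function names `bev_cell` and `pool_to_bev`, the `cell_size` parameter, and the choice of mean pooling are all assumptions introduced here.

```python
from collections import defaultdict
from math import floor


def bev_cell(x, y, cell_size):
    """Map ground-plane coordinates to an integer BEV grid cell (hypothetical helper)."""
    return (floor(x / cell_size), floor(y / cell_size))


def pool_to_bev(points, features, cell_size=1.0):
    """Average per-point feature vectors over each BEV cell.

    points:   list of (x, y, z) tuples; z is ignored by the top-down view
    features: list of equal-length feature vectors, one per point
    Mean pooling is an assumption of this sketch, not taken from the paper.
    """
    sums = {}
    counts = defaultdict(int)
    for (x, y, _z), feat in zip(points, features):
        cell = bev_cell(x, y, cell_size)
        if cell not in sums:
            sums[cell] = [0.0] * len(feat)
        # Accumulate the feature vector element-wise for this cell.
        sums[cell] = [s + f for s, f in zip(sums[cell], feat)]
        counts[cell] += 1
    return {cell: [s / counts[cell] for s in total] for cell, total in sums.items()}


points = [(0.2, 0.3, 1.0), (0.8, 0.6, 0.5), (2.5, 0.1, 0.0)]
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]]
bev = pool_to_bev(points, features)
```

In this sketch a point that shifts slightly within its cell still contributes to the same pooled feature, which gives one intuition for why area-level fusion could be less sensitive to small pixel-to-point misalignment than point-to-point matching.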

Findings

01 BEV-DG outperforms state-of-the-art methods across three datasets.

02 The approach demonstrates high fault tolerance to point-level misalignment.

03 BEV-DG achieves significant margins over competing methods in all experimental settings.

Abstract

Cross-modal Unsupervised Domain Adaptation (UDA) aims to exploit the complementarity of 2D-3D data to overcome the lack of annotation in a new domain. However, UDA methods rely on access to the target domain during training, meaning the trained model only works in a specific target domain. In light of this, we propose cross-modal learning under bird's-eye view for Domain Generalization (DG) of 3D semantic segmentation, called BEV-DG. DG is more challenging because the model cannot access the target domain during training, meaning it needs to rely on cross-modal learning to alleviate the domain gap. Since 3D semantic segmentation requires the classification of each point, existing cross-modal learning is directly conducted point-to-point, which is sensitive to the misalignment in projections between pixels and points. To this end, our approach aims to optimize domain-irrelevant…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Infrastructure Maintenance and Monitoring

Methods: Contrastive Learning