Fusion-then-Distillation: Toward Cross-modal Positive Distillation for Domain Adaptive 3D Semantic Segmentation

Yao Wu; Mingwei Xing; Yachao Zhang; Yuan Xie; Yanyun Qu

arXiv:2410.19446 · cs.CV · October 28, 2024

TL;DR

This paper introduces FtD++, a novel cross-modal fusion and distillation approach for domain adaptive 3D semantic segmentation, leveraging heterogeneous fusion and positive distillation to improve performance across domains.

Contribution

FtD++ combines feature fusion, positive distillation, and pseudo-labeling to improve cross-modal domain adaptation for 3D semantic segmentation.

Findings

01 Achieves state-of-the-art results on multiple domain adaptation benchmarks.

02 Effectively aligns features across modalities and domains.

03 Improves robustness of 3D semantic segmentation in real-world scenarios.

Abstract

In cross-modal unsupervised domain adaptation, a model trained on source-domain data (e.g., synthetic) is adapted to target-domain data (e.g., real-world) without access to target annotations. Previous methods seek to mutually mimic cross-modal outputs in each domain, which enforces a class probability distribution that is agreeable in different domains. However, they overlook the complementarity brought by heterogeneous fusion in cross-modal learning. In light of this, we propose a novel fusion-then-distillation (FtD++) method to explore cross-modal positive distillation of the source and target domains for 3D semantic segmentation. FtD++ realizes distribution consistency between outputs not only for 2D images and 3D point clouds but also for the source domain and the augment domain. Specifically, our method contains three key ingredients. First, we present a model-agnostic feature fusion…
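
To make the "mutually mimic cross-modal outputs" idea concrete, here is a minimal sketch of the kind of objective the abstract attributes to previous methods: a symmetric KL divergence between the per-point class distributions predicted by a 2D branch and a 3D branch. This is an illustration of that prior idea only, not the FtD++ loss, which the truncated abstract does not specify. The function names (softmax, kl_divergence, cross_modal_mimicry_loss) and the input layout (one list of class logits per point) are assumptions made for this example. It uses plain Python so it runs without a deep learning framework.

```python
import math


def softmax(logits):
    """Turn one point's class logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_modal_mimicry_loss(logits_2d, logits_3d):
    """Mean symmetric KL between 2D-branch and 3D-branch predictions.

    logits_2d, logits_3d: lists of per-point logit lists with matching
    shapes (hypothetical layout chosen for this sketch).
    """
    if not logits_2d or len(logits_2d) != len(logits_3d):
        raise ValueError("expected two non-empty, equally sized inputs")
    total = 0.0
    for l2d, l3d in zip(logits_2d, logits_3d):
        p2d, p3d = softmax(l2d), softmax(l3d)
        # Each branch is pulled toward the other's distribution.
        total += kl_divergence(p3d, p2d) + kl_divergence(p2d, p3d)
    return total / len(logits_2d)


# Two points, three classes: the branches agree on the first point
# and disagree on the second.
example_2d = [[2.0, 0.5, 0.1], [0.2, 3.0, 0.1]]
example_3d = [[2.0, 0.5, 0.1], [3.0, 0.2, 0.1]]
loss = cross_modal_mimicry_loss(example_2d, example_3d)
```

The loss is zero when both branches predict the same distribution at every point and grows as they disagree. In a framework with automatic differentiation, the target distribution in each KL term is typically treated as a constant so that each branch learns from the other; that detail is omitted here.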

Code & Models

Repositories

barcaaaa/ftd-plusplus
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques