MUSDA: Multi-source Multi-modality Unsupervised Domain Adaptive 3D Object Detection for Autonomous Driving

Xiaohu Lu; Hamed Khatounabadi; Hayder Radha

arXiv:2605.10026·cs.CV·May 12, 2026

TL;DR

This paper introduces a novel multi-source, multi-modality unsupervised domain adaptation framework for 3D object detection in autonomous driving, effectively integrating camera and LiDAR data across multiple datasets.

Contribution

It proposes hierarchical spatially-conditioned domain classifiers and a prototype graph weighted fusion strategy to improve multi-source, multi-modality domain adaptation in 3D detection.

Findings

1. Outperforms state-of-the-art methods on Waymo, nuScenes, and Lyft datasets.
2. Effectively aligns features from camera and LiDAR modalities across domains.
3. Leverages multiple source domains to enhance detection in unlabeled target domains.

Abstract

With the advancement of autonomous driving, numerous annotated multi-modality datasets have become available. This presents an opportunity to develop domain-adaptive 3D object detectors for new environments without relying on labor-intensive manual annotations. However, traditional domain adaptation methods typically focus on a single source domain or a single modality, limiting their effectiveness in multi-source, multi-modality scenarios. In this paper, we propose a novel framework for multi-source, multi-modality unsupervised domain adaptation in 3D object detection for autonomous driving. Given multiple labeled source domains and one unlabeled target domain, our framework first introduces hierarchical spatially-conditioned (HSC) domain classifiers, which jointly align features from both camera and LiDAR modalities at two distinct levels for each source-target domain pair. To…
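The problem setup in the abstract (several labeled source domains, one unlabeled target domain, and alignment at two levels for each source-target pair) can be illustrated with a small bookkeeping sketch. This is not the authors' implementation. The names `DomainClassifierSpec`, `build_classifier_specs`, `domain_labels`, and the level names `level_1` and `level_2` are hypothetical. The choice of which datasets act as sources and which as target is an assumption made only for illustration, as is the reading that there is one classifier per pair per level.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DomainClassifierSpec:
    """Hypothetical description of one domain classifier (illustration only)."""
    source: str  # labeled source domain
    target: str  # unlabeled target domain
    level: str   # which of the two alignment levels this classifier serves


def build_classifier_specs(sources, target, levels=("level_1", "level_2")):
    """Enumerate one classifier spec per (source, level) for a single target.

    Assumption: one classifier per source-target pair per level. The abstract
    only states that alignment happens at two levels for each pair.
    """
    if target in sources:
        raise ValueError("the target domain must not also be a source domain")
    return [DomainClassifierSpec(s, target, lv) for s, lv in product(sources, levels)]


def domain_labels(n_source, n_target):
    """Binary domain labels for one pair: 0 = source sample, 1 = target sample."""
    return [0] * n_source + [1] * n_target


# Assumed split for illustration: two source datasets, one target dataset.
specs = build_classifier_specs(["Waymo", "nuScenes"], "Lyft")
for spec in specs:
    print(spec)
```

With two sources and two levels, the sketch enumerates four classifier specs. Each would be trained to tell source samples from target samples using labels such as those from `domain_labels`, which requires no object annotations on the target domain.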
