Multimodal 3D Object Detection on Unseen Domains

Deepti Hegde; Suhas Lohit; Kuan-Chuan Peng; Michael J. Jones; Vishal M. Patel

arXiv:2404.11764 · cs.CV · April 19, 2024 · 1 cites

TL;DR

This paper introduces CLIX$^\text{3D}$, a multimodal fusion and contrastive learning framework that enhances 3D object detection robustness across unseen domains by leveraging paired LiDAR-image data and promoting feature invariance.

Contribution

The paper presents a novel multimodal fusion and contrastive learning approach for 3D object detection that improves generalization to unseen domains without requiring target domain data during training.

Findings

01

CLIX$^\text{3D}$ achieves state-of-the-art domain generalization performance.

02

Multimodal features improve robustness to domain shifts.

03

Feature invariance across source domains enhances unseen domain detection.
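
The idea of feature invariance across source domains can be illustrated with a generic supervised contrastive loss in which positives are samples of the same object class drawn from a different domain. This is a minimal sketch under stated assumptions, not the paper's formulation: the function name, the temperature value, and the choice of positives are all hypothetical and serve only to show how such a loss rewards features that agree across domains.

```python
import numpy as np

def cross_domain_contrastive_loss(features, class_ids, domain_ids, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's loss).

    Positives for a sample are the other samples with the same class label
    that come from a different domain; all other samples act as negatives.
    """
    features = np.asarray(features, dtype=np.float64)
    class_ids = np.asarray(class_ids)
    domain_ids = np.asarray(domain_ids)
    n = len(features)

    same_class = class_ids[:, None] == class_ids[None, :]
    other_domain = domain_ids[:, None] != domain_ids[None, :]
    positives = same_class & other_domain
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if n < 2 or not has_pos.any():
        return 0.0  # no cross-domain positive pairs in this batch

    # Cosine similarity between L2-normalized features, scaled by temperature.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # Exclude self-similarity from the softmax denominator.
    logits = np.where(np.eye(n, dtype=bool), -np.inf, logits)
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Average log-probability of the positives for each anchor that has any.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    mean_pos = pos_log_prob[has_pos] / pos_count[has_pos]
    return float(-mean_pos.mean())

# Toy batch: two classes observed in two domains (all values are made up).
class_ids = [0, 1, 0, 1]
domain_ids = [0, 0, 1, 1]
aligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

loss_aligned = cross_domain_contrastive_loss(aligned, class_ids, domain_ids)
loss_misaligned = cross_domain_contrastive_loss(misaligned, class_ids, domain_ids)
print(loss_aligned < loss_misaligned)
```

In the toy batch, the loss is lower when same-class features coincide across the two domains than when they are swapped, which is the behavior a domain-invariance objective of this kind is meant to encourage.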

Abstract

LiDAR datasets for autonomous driving exhibit biases in properties such as point cloud density, range, and object dimensions. As a result, object detection networks trained and evaluated in different environments often experience performance degradation. Domain adaptation approaches assume access to unannotated samples from the test distribution to address this problem. However, in the real world, the exact conditions of deployment and access to samples representative of the test dataset may be unavailable while training. We argue that the more realistic and challenging formulation is to require robustness in performance to unseen target domains. We propose to address this problem in a two-pronged manner. First, we leverage paired LiDAR-image data present in most autonomous driving datasets to perform multimodal object detection. We suggest that working with multimodal features by…

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization

Methods: Contrastive Learning