Multimodal 3D Object Detection on Unseen Domains
Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Vishal M. Patel

TL;DR
This paper introduces CLIX$^\text{3D}$, a multimodal fusion and contrastive learning framework that enhances 3D object detection robustness across unseen domains by leveraging paired LiDAR-image data and promoting feature invariance.
Contribution
The paper presents a novel multimodal fusion and contrastive learning approach for 3D object detection that improves generalization to unseen domains without requiring target domain data during training.
Findings
CLIX$^\text{3D}$ achieves state-of-the-art domain generalization performance.
Multimodal features improve robustness to domain shifts.
Feature invariance across source domains enhances unseen domain detection.
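As an illustration of how contrastive learning can promote feature invariance across source domains, here is a minimal sketch of a generic supervised contrastive loss in NumPy. It is not taken from the paper, whose exact loss is not given in this summary; the function name `supervised_contrastive_loss`, the `temperature` value, and the toy features are all assumptions made for the example. The idea shown is only that features of the same object class, possibly drawn from different source domains, are pulled together while features of other classes are pushed apart.

```python
import numpy as np


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (hypothetical helper, not the paper's).

    features: (n, d) array of feature vectors, n >= 2.
    labels:   (n,) array of class labels; samples sharing a label are positives.
    """
    # L2-normalise so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Positives: other samples in the batch with the same class label.
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-softmax over all other samples (self excluded), computed stably.
    logits = np.where(not_self, logits, -np.inf)
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Mean log-probability of the positives per anchor; anchors without
    # any positive in the batch are skipped.
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())


# Toy batch: two classes, each with one sample from two assumed source domains.
labels = np.array([0, 0, 1, 1])
# Same-class features agree across domains (domain-invariant).
aligned = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
# Same-class features disagree across domains.
misaligned = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])

loss_aligned = supervised_contrastive_loss(aligned, labels)
loss_misaligned = supervised_contrastive_loss(misaligned, labels)
```

In this toy setup the loss is lower when same-class features agree across the two assumed domains, which is the behaviour a domain-invariance objective of this kind rewards.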
Abstract
LiDAR datasets for autonomous driving exhibit biases in properties such as point cloud density, range, and object dimensions. As a result, object detection networks trained and evaluated in different environments often experience performance degradation. Domain adaptation approaches assume access to unannotated samples from the test distribution to address this problem. However, in the real world, the exact conditions of deployment and access to samples representative of the test dataset may be unavailable while training. We argue that the more realistic and challenging formulation is to require robustness in performance to unseen target domains. We propose to address this problem in a two-pronged manner. First, we leverage paired LiDAR-image data present in most autonomous driving datasets to perform multimodal object detection. We suggest that working with multimodal features by…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization
Methods: Contrastive Learning
