TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers
Xuyang Bai, Zeyu Hu, Xinge Zhu, Qingqiu Huang, Yilun Chen, Hongbo Fu, Chiew-Lan Tai

TL;DR
TransFusion introduces a transformer-based LiDAR-camera fusion method that adaptively combines sensor data, enhancing robustness against poor image conditions and calibration errors in 3D object detection for autonomous driving.
Contribution
It proposes a novel soft-association fusion mechanism using transformers, improving robustness and accuracy over existing hard-association methods.
Findings
Achieves state-of-the-art performance on large-scale datasets.
Demonstrates robustness against degraded image quality and calibration errors.
Extends effectively to 3D tracking, taking first place on the nuScenes tracking leaderboard.
Abstract
LiDAR and camera are two important sensors for 3D object detection in autonomous driving. Despite the increasing popularity of sensor fusion in this field, the robustness against inferior image conditions, e.g., bad illumination and sensor misalignment, is under-explored. Existing fusion methods are easily affected by such conditions, mainly due to a hard association of LiDAR points and image pixels, established by calibration matrices. We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions. Specifically, our TransFusion consists of convolutional backbones and a detection head based on a transformer decoder. The first layer of the decoder predicts initial bounding boxes from a LiDAR point cloud using a sparse set of object queries, and its second decoder layer adaptively fuses the object queries with useful…
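The contrast between hard and soft association described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head with no learned projections, random stand-in features, and hypothetical function names (`soft_association`, `hard_association`). It only shows the general idea that a query can take a weighted mix of all image features rather than one calibration-selected pixel.

```python
import numpy as np

def soft_association(queries, image_feats):
    """Single-head cross-attention (illustrative, no learned weights).

    queries:     (num_queries, d) object-query features
    image_feats: (num_pixels, d) flattened image features
    Returns the fused query features and the attention weights.
    """
    d = queries.shape[-1]
    # Scaled dot-product similarity between every query and every pixel.
    scores = queries @ image_feats.T / np.sqrt(d)
    # Numerically stable softmax over the pixel axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each query receives a weighted average of all image features.
    fused = weights @ image_feats
    return fused, weights

def hard_association(pixel_index, image_feats):
    """Pick exactly one image feature per query, as a fixed projection
    through a calibration matrix would (indices are assumed given)."""
    return image_feats[pixel_index]

# Stand-in data: 4 object queries, 100 image locations, 16 channels.
rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 16))
image_feats = rng.normal(size=(100, 16))

fused, weights = soft_association(queries, image_feats)
picked = hard_association(np.array([3, 17, 42, 99]), image_feats)
```

In the hard case, an error in the projected pixel index replaces the whole feature. In the soft case, the weights are computed from feature similarity, so the contribution of any single location is bounded by its attention weight.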
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced X-ray and CT Imaging
