Cross-Level Sensor Fusion with Object Lists via Transformer for 3D Object Detection
Xiangzhong Liu, Jiajie Zhang, Hao Shen

TL;DR
This paper introduces an end-to-end Transformer-based method for 3D object detection that fuses processed object lists with raw camera images, improving accuracy and training efficiency in automotive sensor systems.
Contribution
It presents the first cross-level fusion approach that combines object lists and raw images using a Transformer, along with a novel pseudo-label generation method for training.
Findings
Significant performance gains over the baseline on the nuScenes dataset
Effective generalization across various noise levels in object lists
Accelerated training convergence due to Gaussian mask integration
Abstract
In automotive sensor fusion systems, smart sensors and Vehicle-to-Everything (V2X) modules are commonly utilized. Sensor data from these systems are typically available only as processed object lists rather than raw sensor data from traditional sensors. Instead of processing other raw data separately and then fusing them at the object level, we propose an end-to-end cross-level fusion concept with a Transformer, which integrates highly abstract object list information with raw camera images for 3D object detection. Object lists are fed into the Transformer as denoising queries and propagated together with learnable queries through the subsequent feature aggregation process. Additionally, a deformable Gaussian mask, derived from the positional and size priors in the object lists, is explicitly integrated into the Transformer decoder. This directs attention toward the target area of…
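To make the Gaussian mask idea more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain axis-aligned 2D Gaussian built from one object-list entry (a center and a size on a feature grid) and assumes the mask enters attention as an additive log-bias on the logits. The abstract does not specify the "deformable" part or how the mask is integrated, so the function names, the sigma_scale parameter, and the log-bias formulation are all hypothetical.

```python
import numpy as np

def gaussian_mask(center, size, grid_hw, sigma_scale=0.5):
    """Axis-aligned 2D Gaussian over an H x W grid, equal to 1.0 at `center`.

    center: (x, y) in grid cells; size: (width, height) in grid cells.
    sigma_scale is a hypothetical knob tying the spread to the object size.
    """
    h, w = grid_hw
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    cx, cy = center
    sx = max(size[0] * sigma_scale, 1e-6)  # guard against zero-size priors
    sy = max(size[1] * sigma_scale, 1e-6)
    return np.exp(-0.5 * (((xs - cx) / sx) ** 2 + ((ys - cy) / sy) ** 2))

def masked_attention(logits, mask, eps=1e-6):
    """Softmax over keys after adding log(mask) as a bias (assumed scheme)."""
    biased = logits + np.log(mask.reshape(-1) + eps)
    biased -= biased.max()  # subtract the max for numerical stability
    weights = np.exp(biased)
    return weights / weights.sum()

# One query with uniform logits over an 8 x 8 feature grid: the prior alone
# decides where the attention mass goes.
mask = gaussian_mask(center=(4, 3), size=(2, 2), grid_hw=(8, 8))
weights = masked_attention(np.zeros(64), mask)
```

With uniform logits, the attention weights peak at the grid cell named by the object-list prior and decay with distance from it. This is one plausible way such a prior could narrow the search region early in training.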
Taxonomy
Topics: Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety · Adversarial Robustness in Machine Learning
