TransCAR: Transformer-based Camera-And-Radar Fusion for 3D Object Detection
Su Pang, Daniel Morris, Hayder Radha

TL;DR
TransCAR introduces a transformer-based fusion method that combines camera and radar data for 3D object detection, improving velocity estimation and outperforming existing camera-radar fusion approaches on the nuScenes dataset.
Contribution
The paper presents a novel Transformer-based Camera-Radar fusion framework that adaptively learns sensor data associations for improved 3D detection.
Findings
Outperforms state-of-the-art Camera-Radar fusion methods on nuScenes.
Improves velocity estimation without relying on temporal data.
Effectively learns soft associations between radar and vision features.
Abstract
Despite radar's popularity in the automotive industry, for fusion-based 3D object detection, most existing works focus on LiDAR and camera fusion. In this paper, we propose TransCAR, a Transformer-based Camera-And-Radar fusion solution for 3D object detection. Our TransCAR consists of two modules. The first module learns 2D features from surround-view camera images and then uses a sparse set of 3D object queries to index into these 2D features. The vision-updated queries then interact with each other via a transformer self-attention layer. The second module learns radar features from multiple radar scans and then applies a transformer decoder to learn the interactions between radar features and vision-updated queries. The cross-attention layer within the transformer decoder can adaptively learn the soft-association between the radar features and vision-updated queries instead of…
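The soft association described in the abstract can be illustrated with a toy cross-attention step. This is a minimal sketch, not the authors' implementation: it assumes single-head scaled dot-product attention with no learned projections, no positional encodings, and a plain residual update, and the names `cross_attend`, `queries`, and `radar_feats` are hypothetical. Each row of the returned weights is a probability distribution over radar features, so every query is softly associated with all radar returns rather than matched to one.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries, radar_feats):
    """Toy single-head cross-attention (assumption: no learned projections).

    queries:     list of query vectors (stand-ins for vision-updated queries)
    radar_feats: list of radar feature vectors with the same dimension
    Returns (updated_queries, weights), where weights[i][j] is the soft
    association of query i with radar feature j.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    updated, weights = [], []
    for q in queries:
        # Similarity of this query to every radar feature, then softmax.
        w = softmax([dot(q, r) * scale for r in radar_feats])
        # Weighted mix of radar features for this query.
        mixed = [
            sum(w[j] * radar_feats[j][k] for j in range(len(radar_feats)))
            for k in range(dim)
        ]
        # Residual update: keep the query and add the radar evidence.
        updated.append([qk + mk for qk, mk in zip(q, mixed)])
        weights.append(w)
    return updated, weights


# Toy data: two queries, three radar features, dimension 2.
queries = [[1.0, 0.0], [0.0, 1.0]]
radar_feats = [[4.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
updated, weights = cross_attend(queries, radar_feats)
```

In this toy setup the first query puts most of its weight on the first radar feature and the second query on the second, while still assigning nonzero weight to the others.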
Taxonomy
Topics: Advanced SAR Imaging Techniques · Advanced Neural Network Applications · Infrared Target Detection Methodologies
