Boosting Instance Awareness via Cross-View Correlation with 4D Radar and Camera for 3D Object Detection
Xiaokai Bai, Lianqing Zheng, Si-Yuan Cao, Xiaohan Zhang, Zhe Wu, Beinan Yu, Fang Wang, Jie Bai, and Hui-Liang Shen

TL;DR
SIFormer is a transformer-based method that enhances 3D object detection by integrating 4D radar and camera data, addressing radar sparsity and improving instance awareness for autonomous driving.
Contribution
SIFormer introduces a cross-view activation mechanism that injects 2D instance cues into BEV space, together with a transformer-based fusion module that combines radar and camera information for 3D detection.
Findings
Achieves state-of-the-art results on multiple datasets
Effectively suppresses background noise during view transformation
Enhances instance awareness by integrating 2D cues into BEV space
Abstract
4D millimeter-wave radar has emerged as a promising sensing modality for autonomous driving due to its robustness and affordability. However, its sparse and weak geometric cues make reliable instance activation difficult, limiting the effectiveness of existing radar-camera fusion paradigms. BEV-level fusion offers global scene understanding but suffers from weak instance focus, while perspective-level fusion captures instance details but lacks holistic context. To address these limitations, we propose SIFormer, a scene-instance aware transformer for 3D object detection using 4D radar and camera. SIFormer first suppresses background noise during view transformation through segmentation- and depth-guided localization. It then introduces a cross-view activation mechanism that injects 2D instance cues into BEV space, enabling reliable instance awareness under weak radar geometry. Finally, a…
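The abstract's first step, suppressing background during view transformation with segmentation and depth guidance, can be illustrated with a small sketch. This is not SIFormer's implementation: it assumes a Lift-Splat-style lifting in which each pixel's feature is spread along its camera ray by a per-pixel depth distribution, and it gates that spread with a foreground probability from a segmentation head. The function names, the hard threshold, and the multiplicative gating are all hypothetical choices made for illustration.

```python
# Hypothetical sketch: segmentation- and depth-guided lifting of one pixel.
# Assumption: Lift-Splat-style lifting, gated by a foreground probability.
# None of these names come from the SIFormer paper.

def lift_weights(depth_probs, fg_prob, threshold=0.5):
    """Return per-depth-bin weights for one pixel.

    depth_probs: probabilities over depth bins along the pixel's ray.
    fg_prob: foreground probability for the pixel (segmentation output).
    Pixels below `threshold` are treated as background and contribute nothing.
    """
    gate = fg_prob if fg_prob >= threshold else 0.0
    return [gate * p for p in depth_probs]


def lift_pixel(feature, depth_probs, fg_prob, threshold=0.5):
    """Spread a pixel's feature vector along its ray, one copy per depth bin,
    scaled by the gated depth weights."""
    weights = lift_weights(depth_probs, fg_prob, threshold)
    return [[w * f for f in feature] for w in weights]


if __name__ == "__main__":
    depth = [0.1, 0.7, 0.2]
    # A likely-background pixel is suppressed entirely.
    print(lift_weights(depth, fg_prob=0.1))
    # A likely-foreground pixel keeps its depth distribution, scaled by confidence.
    print(lift_weights(depth, fg_prob=0.9))
```

In a full pipeline the lifted features would then be pooled into BEV cells. The point of the gate is that background pixels never reach that pooling step, which is one plausible reading of "suppresses background noise during view transformation".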
Taxonomy
Topics: Advanced SAR Imaging Techniques · Advanced Neural Network Applications · Advanced Optical Sensing Technologies
