TL;DR
R4Det introduces a novel fusion framework combining radar and camera data to improve 3D object detection in autonomous driving, addressing depth estimation, temporal fusion, and small object detection challenges.
Contribution
It proposes new modules for depth fusion, pose-independent temporal fusion, and semantic refinement, achieving state-of-the-art results in 3D detection.
Findings
Achieves state-of-the-art results on TJ4DRadSet and VoD datasets.
Enhances depth estimation through Panoramic Depth Fusion.
Develops a pose-independent temporal fusion module.
Abstract
The 4D radar-camera sensing configuration has gained increasing importance in autonomous driving. However, existing 3D object detection methods that fuse 4D radar and camera data face several challenges. First, their absolute depth estimation module is not robust or accurate enough, leading to inaccurate 3D localization. Second, the performance of their temporal fusion module degrades dramatically, or fails outright, when the ego vehicle's pose is missing or inaccurate. Third, because radar point clouds are sparse, some small objects may return no radar points at all. In such cases, detection must rely solely on visual unimodal priors. To address these limitations, we propose R4Det, which enhances depth estimation quality via the Panoramic Depth Fusion module, enabling mutual reinforcement between absolute and relative depth. For temporal fusion, we design a Deformable Gated…
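To make the interplay between absolute and relative depth more concrete, here is a minimal sketch of one generic way the two can inform each other: fitting a scale and shift that maps a dense relative depth map onto sparse metric depths at radar hits. This is not R4Det's Panoramic Depth Fusion module, whose internals are not described in the excerpt above. The function names, the dictionary-based depth maps, and the assumption that relative depth is affinely related to metric depth are all hypothetical choices made for illustration.

```python
# Illustrative only: generic least-squares scale/shift alignment, NOT the
# paper's Panoramic Depth Fusion module. All names here are hypothetical.

def fit_scale_shift(relative, absolute):
    """Least-squares fit of absolute ~= scale * relative + shift."""
    n = len(relative)
    if n < 2 or n != len(absolute):
        raise ValueError("need at least two paired samples of equal length")
    mean_r = sum(relative) / n
    mean_a = sum(absolute) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0:
        raise ValueError("relative depths are all equal; scale is undefined")
    cov = sum((r - mean_r) * (a - mean_a) for r, a in zip(relative, absolute))
    scale = cov / var_r
    shift = mean_a - scale * mean_r
    return scale, shift


def align_relative_depth(relative_map, sparse_absolute):
    """Turn a dense relative depth map into metric depth.

    relative_map:    {(row, col): relative depth} for every pixel
    sparse_absolute: {(row, col): metric depth in metres} at radar hits only
    Assumes (for this sketch) an affine relation between the two.
    """
    pixels = [p for p in sparse_absolute if p in relative_map]
    scale, shift = fit_scale_shift(
        [relative_map[p] for p in pixels],
        [sparse_absolute[p] for p in pixels],
    )
    return {p: scale * d + shift for p, d in relative_map.items()}


# Toy 2x2 image: dense relative depth, but only two pixels have radar depth.
relative_map = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.4, (1, 1): 0.8}
sparse_absolute = {(0, 0): 6.0, (1, 1): 20.0}
dense_metric = align_relative_depth(relative_map, sparse_absolute)
```

In this toy case the two radar hits pin down the scale and shift, and the pixels without radar returns inherit metric depth from the relative map. That is the situation the abstract raises for small objects with no radar points, although a learned fusion module would be far more expressive than a single global affine fit.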
