Echoes Beyond Points: Unleashing the Power of Raw Radar Data in Multi-modality Fusion
Yang Liu, Feng Wang, Naiyan Wang, Zhaoxiang Zhang

TL;DR
This paper introduces EchoFusion, a novel multi-modality fusion method that directly utilizes raw radar data to improve autonomous driving perception, surpassing existing radar-based methods and approaching LiDAR performance.
Contribution
EchoFusion skips traditional radar signal processing, directly fusing raw radar spectrum features with other sensors for enhanced perception in autonomous driving.
Findings
Outperforms existing radar-based fusion methods on the RADIal dataset
Approaches LiDAR-level performance using raw radar data
Effectively combines semantic and distance clues from multiple sensors
Abstract
Radar is ubiquitous in autonomous driving systems due to its low cost and good adaptability to bad weather. Nevertheless, radar detection performance is usually inferior because its point cloud is sparse and inaccurate, owing to poor azimuth and elevation resolution. Moreover, point cloud generation algorithms already drop weak signals to reduce false targets, which may be suboptimal for deep fusion. In this paper, we propose a novel method named EchoFusion that skips the existing radar signal processing pipeline and incorporates the raw radar data with other sensors. Specifically, we first generate Bird's Eye View (BEV) queries and then take the corresponding spectrum features from radar to fuse with other sensors. With this approach, our method can utilize both rich and lossless distance and speed clues from radar echoes and rich semantic clues from images,…
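The idea of letting BEV locations pick up "corresponding spectrum features" can be illustrated with a small geometric sketch. This is not the authors' implementation: it assumes a range-azimuth feature map of shape (range bins, azimuth bins, channels), a sensor at the origin facing the +x axis, and a plain nearest-bin lookup standing in for whatever learned sampling the paper uses. The function names, grid sizes, and field-of-view parameter are all hypothetical.

```python
import numpy as np

def bev_to_polar_indices(xs, ys, max_range, num_range_bins, fov, num_azimuth_bins):
    """Map Cartesian BEV cell centers to (range_bin, azimuth_bin) indices.

    Assumes x points forward, y points left, and the azimuth field of view
    `fov` (radians) is centered on the x axis. All parameters are illustrative.
    """
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    rng = np.hypot(X, Y)              # distance of each BEV cell from the sensor
    az = np.arctan2(Y, X)             # 0 rad = straight ahead
    r_idx = np.floor(rng / max_range * num_range_bins).astype(int)
    a_idx = np.floor((az + fov / 2) / fov * num_azimuth_bins).astype(int)
    # Cells beyond the maximum range or outside the field of view have no radar bin.
    valid = (
        (r_idx >= 0) & (r_idx < num_range_bins)
        & (a_idx >= 0) & (a_idx < num_azimuth_bins)
    )
    return r_idx, a_idx, valid

def sample_radar_features(ra_features, r_idx, a_idx, valid):
    """Gather one radar feature vector per BEV cell; uncovered cells stay zero."""
    out = np.zeros(r_idx.shape + (ra_features.shape[-1],), dtype=ra_features.dtype)
    out[valid] = ra_features[r_idx[valid], a_idx[valid]]
    return out

# Usage: a toy 4x4 range-azimuth map with one channel, sampled at three BEV cells.
ra = np.arange(16, dtype=float).reshape(4, 4, 1)
r_idx, a_idx, valid = bev_to_polar_indices(
    np.array([1.0, 3.0, 5.0]), np.array([0.0]), 4.0, 4, np.pi / 2, 4
)
bev = sample_radar_features(ra, r_idx, a_idx, valid)
```

In the usage lines, the third cell lies beyond the assumed 4 m maximum range, so it is marked invalid and keeps a zero feature. In a fusion model, the gathered radar features would then be combined with image features at the same BEV locations.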
Taxonomy
Topics: Remote Sensing and LiDAR Applications · Advanced Optical Sensing Technologies · Advanced Neural Network Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
