REFNet++: Multi-Task Efficient Fusion of Camera and Radar Sensor Data in Bird's-Eye Polar View
Kavin Chandrasekaran, Sorin Grigorescu, Gijs Dubbelman, Pavol Jancura

TL;DR
This paper introduces REFNet++, a multimodal sensor fusion method that aligns radar and camera data in a unified Bird's-Eye Polar View domain for improved vehicle perception.
Contribution
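It proposes a variational encoder-decoder architecture that learns to transform front-view camera data into the Bird's-Eye View (BEV) polar domain, so that camera and radar data can be fused in a common domain with attention to both accuracy and computational efficiency.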
Findings
Outperforms state-of-the-art methods on the RADIal dataset for vehicle detection.
Achieves robust sensor fusion with improved computational efficiency.
Effectively aligns radar and camera data in a unified BEV polar domain.
Abstract
A realistic view of the vehicle's surroundings is generally offered by camera sensors, which is crucial for environmental perception. Affordable radar sensors, on the other hand, are becoming invaluable due to their robustness in variable weather conditions. However, because of their noisy output and reduced classification capability, they work best when combined with other sensor data. Specifically, we address the challenge of multimodal sensor fusion by aligning radar and camera data in a unified domain, prioritizing not only accuracy, but also computational efficiency. Our work leverages the raw range-Doppler (RD) spectrum from radar and front-view camera images as inputs. To enable effective fusion, we employ a variational encoder-decoder architecture that learns the transformation of front-view camera data into the Bird's-Eye View (BEV) polar domain. Concurrently, a radar…
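To make the idea of a BEV polar domain concrete, the following is a minimal sketch of how a point on the ground plane could be assigned to a cell of a range-azimuth grid. It is not taken from the paper: the function name `cartesian_to_polar_bin`, the axis convention (x forward, y left), and all grid parameters (maximum range, field of view, bin counts) are assumptions chosen purely for illustration.

```python
import math


def cartesian_to_polar_bin(x, y, max_range=100.0, n_range_bins=256,
                           fov_deg=180.0, n_azimuth_bins=224):
    """Map a ground-plane point to a (range_bin, azimuth_bin) cell.

    Hypothetical convention: x points forward and y points left, both in
    metres; azimuth 0 degrees is straight ahead. All default grid sizes are
    illustrative assumptions, not values from the paper.
    Returns None when the point falls outside the grid.
    """
    rng = math.hypot(x, y)                    # radial distance in metres
    az = math.degrees(math.atan2(y, x))       # azimuth in degrees
    half_fov = fov_deg / 2.0

    # Reject points beyond the maximum range or outside the field of view.
    if rng >= max_range or az < -half_fov or az >= half_fov:
        return None

    r_bin = int(rng / max_range * n_range_bins)
    a_bin = int((az + half_fov) / fov_deg * n_azimuth_bins)
    return r_bin, a_bin


if __name__ == "__main__":
    # A point 50 m straight ahead lands in the middle of both axes.
    print(cartesian_to_polar_bin(50.0, 0.0))
    # A point behind the sensor is outside the assumed 180-degree view.
    print(cartesian_to_polar_bin(-1.0, 0.0))
```

In a polar grid of this kind, cells near the sensor cover less ground area than cells far away, which matches how a radar natively samples the scene in range and angle. This is one reason a polar layout can be a natural common domain for radar and camera features.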
