Unleashing HyDRa: Hybrid Fusion, Depth Consistency and Radar for Unified 3D Perception

Philipp Wolters; Johannes Gilg; Torben Teepe; Fabian Herzog; Anouar Laouichi; Martin Hofmann; Gerhard Rigoll

arXiv:2403.07746 · cs.CV · March 27, 2025 · 3 cites

TL;DR

HyDRa introduces a novel camera-radar fusion architecture that enhances 3D perception for autonomous driving by combining features in multiple representations, achieving state-of-the-art results in depth prediction and occupancy estimation.

Contribution

The paper presents a hybrid fusion approach with a Height Association Transformer and Radar-weighted Depth Consistency, improving depth prediction and BEV feature quality in camera-radar systems.

Findings

01

Achieves 64.2 NDS on the nuScenes dataset, a new state of the art for camera-radar fusion.

02

Outperforms all previous camera-based methods on the Occ3D benchmark by 3.7 mIoU.

03

Introduces a hybrid fusion architecture that combines camera and radar features in both the perspective view and the BEV.
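
For readers unfamiliar with the two metrics quoted in the findings, the sketch below shows how they are commonly computed. It is not taken from the HyDRa repository. The NDS formula follows the public nuScenes detection benchmark definition; the function names, the flat label lists, and all numeric inputs are hypothetical and chosen only for illustration. The mIoU helper assumes a plain per-class average over classes that occur in the prediction or the ground truth, and it ignores any benchmark-specific voxel masking.

```python
from typing import Dict, List, Sequence


def nuscenes_detection_score(mean_ap: float, tp_errors: Dict[str, float]) -> float:
    """NDS = (5 * mAP + sum(1 - min(1, err))) / 10 over five TP error metrics."""
    if len(tp_errors) != 5:
        raise ValueError("NDS expects exactly five true-positive error metrics")
    # Each error is clipped to [0, 1] and converted into a score.
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes for flat label sequences."""
    ious: List[float] = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical inputs, not results from the paper.
example_errors = {"mATE": 0.4, "mASE": 0.3, "mAOE": 0.5, "mAVE": 0.2, "mAAE": 0.1}
example_nds = nuscenes_detection_score(0.5, example_errors)
example_miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Both helpers return a fraction in [0, 1]; leaderboards usually report the value multiplied by 100, so a reported 64.2 NDS corresponds to 0.642 on this scale.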

Abstract

Low-cost, vision-centric 3D perception systems for autonomous driving have made significant progress in recent years, narrowing the gap to expensive LiDAR-based methods. The primary challenge in becoming a fully reliable alternative lies in robust depth prediction capabilities, as camera-based systems struggle with long detection ranges and adverse lighting and weather conditions. In this work, we introduce HyDRa, a novel camera-radar fusion architecture for diverse 3D perception tasks. Building upon the principles of dense BEV (Bird's Eye View)-based architectures, HyDRa introduces a hybrid fusion approach to combine the strengths of complementary camera and radar features in two distinct representation spaces. Our Height Association Transformer module leverages radar features already in the perspective view to produce more robust and accurate depth predictions. In the BEV, we refine…

Code & Models

Repositories

phi-wol/hydra (PyTorch, official)

Taxonomy

Topics: Geophysical Methods and Applications · Robotics and Sensor-Based Localization · Target Tracking and Data Fusion in Sensor Networks

Methods: Attention Is All You Need · Hydra · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Layer Normalization · Absolute Position Encodings · Dropout · Softmax