HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection

Tim Broedermann (1), Christos Sakaridis (1), Dengxin Dai (2), Luc Van Gool (1, 3) ((1) ETH Zurich, (2) MPI for Informatics, (3) KU Leuven)

arXiv:2206.15157 · cs.CV · August 14, 2023

TL;DR

HRFuser is a modular multi-resolution sensor fusion architecture for 2D object detection in autonomous vehicles. It combines camera input with additional modalities such as lidar and radar, and is reported to outperform state-of-the-art fusion methods on nuScenes and DENSE.

Contribution

The paper introduces a multi-resolution fusion architecture built around a novel multi-window cross-attention block. The design scales to an arbitrary number of input modalities, which makes it applicable to multi-modal perception in autonomous driving beyond a fixed sensor pair.

Findings

1. Significantly improves 2D object detection over camera-only models.
2. Outperforms state-of-the-art fusion methods on nuScenes and DENSE datasets.
3. Effectively leverages multiple sensor modalities for robust perception.

Abstract

Besides standard cameras, autonomous vehicles typically include multiple additional sensors, such as lidars and radars, which help acquire richer information for perceiving the content of the driving scene. While several recent works focus on fusing certain pairs of sensors - such as camera with lidar or radar - by using architectural components specific to the examined setting, a generic and modular sensor fusion architecture is missing from the literature. In this work, we propose HRFuser, a modular architecture for multi-modal 2D object detection. It fuses multiple sensors in a multi-resolution fashion and scales to an arbitrary number of input modalities. The design of HRFuser is based on state-of-the-art high-resolution networks for image-only dense prediction and incorporates a novel multi-window cross-attention block as the means to perform fusion of multiple modalities at…
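
The abstract names a multi-window cross-attention block but the text here is cut off before describing it. As a rough illustration only, the sketch below shows generic window-restricted cross-attention in plain Python: camera features act as queries, each additional modality supplies keys and values, and attention is computed only inside non-overlapping spatial windows. Everything in it is an assumption made for illustration and not the authors' implementation: the function names (`window_cross_attention`, `fuse`), the absence of learned projections, the single window size, and the additive fusion of the attended features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def window_cross_attention(query_map, kv_map, window):
    """Cross-attention restricted to non-overlapping window x window patches.

    Both maps are nested lists shaped [H][W][C]; H and W must be divisible
    by `window`. Queries come from `query_map`, keys and values from `kv_map`.
    Hypothetical simplification: no learned query/key/value projections.
    """
    height, width = len(query_map), len(query_map[0])
    channels = len(query_map[0][0])
    scale = 1.0 / math.sqrt(channels)
    out = [[None] * width for _ in range(height)]
    for top in range(0, height, window):
        for left in range(0, width, window):
            # All pixel positions that belong to the current window.
            cells = [(top + dy, left + dx)
                     for dy in range(window) for dx in range(window)]
            for qy, qx in cells:
                q = query_map[qy][qx]
                # Scaled dot-product scores against keys in the same window.
                scores = [scale * sum(a * b for a, b in zip(q, kv_map[ky][kx]))
                          for ky, kx in cells]
                weights = softmax(scores)
                # Attention output: weighted sum of the window's values.
                out[qy][qx] = [
                    sum(w * kv_map[ky][kx][c]
                        for w, (ky, kx) in zip(weights, cells))
                    for c in range(channels)
                ]
    return out


def fuse(camera_map, other_maps, window=2):
    """Add the attended features of every extra modality to the camera map."""
    fused = [[list(pixel) for pixel in row] for row in camera_map]
    for kv_map in other_maps:
        attended = window_cross_attention(camera_map, kv_map, window)
        for y, row in enumerate(attended):
            for x, vec in enumerate(row):
                fused[y][x] = [f + a for f, a in zip(fused[y][x], vec)]
    return fused


# Toy 4x4 feature maps with 2 channels each (made-up values).
camera = [[[float(y), float(x)] for x in range(4)] for y in range(4)]
lidar = [[[1.0, 2.0] for _ in range(4)] for _ in range(4)]
radar = [[[0.0, 0.0] for _ in range(4)] for _ in range(4)]
fused = fuse(camera, [lidar, radar], window=2)
```

Because the loop over `other_maps` treats each modality identically, adding a further sensor only means appending another feature map to the list, which mirrors the modularity the abstract claims in spirit. A practical version would use a tensor library, learned projections, and multiple attention heads.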

Code & Models

Repositories

timbroed/hrfuser (PyTorch, official)