ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object Detection

Jiwei Chen; Yubao Sun; Laiyan Ding; Rui Huang

arXiv:2410.10298 · cs.CV · June 27, 2025

TL;DR

ROA-BEV introduces a 2D region-oriented attention mechanism for BEV-based 3D object detection. It steers the backbone toward object regions and improves detection accuracy, especially for objects that look similar to the background.

Contribution

The paper proposes a novel 2D region-oriented attention module with multi-scale structures for BEV-based 3D detection, improving feature learning and detection performance.

Findings

01 Improved detection accuracy on the nuScenes dataset.

02 Enhanced feature learning for large objects.

03 Better focus on object regions in BEV representations.

Abstract

Vision-based Bird's-Eye-View (BEV) 3D object detection has recently become popular in autonomous driving. However, existing methods struggle to detect objects that closely resemble the background from the camera perspective. In this paper, we propose a BEV-based 3D Object Detection Network with 2D Region-Oriented Attention (ROA-BEV), which enables the backbone to focus more on feature learning in the regions where objects exist. Moreover, our method further enhances the feature learning ability of ROA through multi-scale structures. Each block of ROA uses a large kernel so that the receptive field is large enough to capture information about large objects. Experiments on nuScenes show that ROA-BEV improves on the performance of its BEVDepth baseline. The source code for this work will be available at https://github.com/DFLyan/ROA-BEV.
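The idea in the abstract (a large-kernel block produces a 2D attention map that reweights image features toward object regions) can be illustrated with a minimal sketch. This is not the authors' implementation: the box filter stands in for a learned large-kernel convolution, and the names `box_filter`, `region_attention`, `gain`, and `bias` are hypothetical choices made only for this example. The multi-scale structure is omitted.

```python
import math


def box_filter(fmap, k):
    """Average each cell over a k x k window with zero padding.

    Stand-in (assumption) for a learned large-kernel convolution; k must be odd.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += fmap[y][x]
            out[i][j] = total / (k * k)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def region_attention(fmap, k=7, gain=4.0, bias=-1.0):
    """Return (mask, weighted): a soft 2D region mask and the reweighted features.

    gain and bias are hypothetical fixed values; a real model would learn them.
    """
    smoothed = box_filter(fmap, k)
    mask = [[sigmoid(gain * v + bias) for v in row] for row in smoothed]
    weighted = [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(fmap, mask)
    ]
    return mask, weighted


# Toy 12 x 12 feature map with one 6 x 6 "object" of active cells.
H, W = 12, 12
fmap = [
    [1.0 if 3 <= i <= 8 and 3 <= j <= 8 else 0.0 for j in range(W)]
    for i in range(H)
]
mask, weighted = region_attention(fmap, k=7)
```

With the 7 x 7 window, a cell inside the object sees most of the object at once, so its mask value is higher than that of a background corner. A smaller kernel would see only part of a large object, which is the motivation the abstract gives for large kernels.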

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Industrial Vision Systems and Defect Detection

Methods: Softmax · Attention Is All You Need · Focus