ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object Detection
Jiwei Chen, Yubao Sun, Laiyan Ding, Rui Huang

TL;DR
ROA-BEV adds a 2D region-oriented attention mechanism to BEV-based 3D object detection. The attention steers the image backbone toward regions that contain objects, which improves detection accuracy, especially for objects that closely resemble the background in camera views.
Contribution
The paper proposes a novel 2D region-oriented attention module with multi-scale structures for BEV-based 3D detection, improving feature learning and detection performance.
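As a rough illustration of how region-oriented attention with multi-scale structure can gate image features, the sketch below fuses attention logits from two scales into one map and multiplies it onto a feature map. It is a generic spatial-attention sketch, not the authors' architecture: the helper names (`upsample_nearest`, `fuse_attention`, `apply_attention`), the averaging fusion rule, and the toy values are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Squash a logit into an attention weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling: repeat each cell `factor` times per axis."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_attention(logit_maps, out_size):
    """Average per-scale sigmoid attention maps at a common resolution.

    Assumes square maps whose side length divides `out_size` (hypothetical
    fusion rule; the summary does not say how scales are combined).
    """
    fused = [[0.0] * out_size for _ in range(out_size)]
    for logits in logit_maps:
        up = upsample_nearest(logits, out_size // len(logits))
        for i in range(out_size):
            for j in range(out_size):
                fused[i][j] += sigmoid(up[i][j]) / len(logit_maps)
    return fused


def apply_attention(features, attention):
    """Element-wise gating: keep features where attention is high."""
    return [[f * a for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(features, attention)]


# Toy example: an "object" occupies the top-left 2x2 corner of a 4x4 map.
fine_logits = [[6.0, 6.0, -6.0, -6.0],
               [6.0, 6.0, -6.0, -6.0],
               [-6.0, -6.0, -6.0, -6.0],
               [-6.0, -6.0, -6.0, -6.0]]
coarse_logits = [[6.0, -6.0],
                 [-6.0, -6.0]]
features = [[1.0] * 4 for _ in range(4)]

attention = fuse_attention([fine_logits, coarse_logits], out_size=4)
gated = apply_attention(features, attention)
```

In this toy case the gated features stay close to 1 inside the object region and drop close to 0 elsewhere, which is the sense in which the backbone is pushed to "focus" on object regions.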
Findings
Improved detection accuracy on the nuScenes dataset relative to the BEVDepth baseline.
Enhanced feature learning for large objects.
Better focus on object regions in BEV representations.
Abstract
Vision-based Bird's-Eye-View (BEV) 3D object detection has recently become popular in autonomous driving. However, existing methods struggle to detect objects that closely resemble the background from the camera perspective. In this paper, we propose a BEV-based 3D Object Detection Network with 2D Region-Oriented Attention (ROA-BEV), which enables the backbone to focus more on feature learning in the regions where objects exist. Moreover, multi-scale structures further strengthen the feature learning ability of ROA. Each ROA block uses a large kernel so that its receptive field is large enough to capture information about large objects. Experiments on nuScenes show that ROA-BEV improves on the performance of BEVDepth, on which it is built. The source code of this work will be available at https://github.com/DFLyan/ROA-BEV.
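The link between kernel size and receptive field mentioned in the abstract can be made concrete with the standard receptive-field recurrence for stacked convolutions. The sketch below is a minimal illustration; the function name `receptive_field` and the kernel sizes used in the examples are assumptions, since this summary does not state which kernel size ROA uses.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of stacked convolutions.

    `layers` is a list of (kernel_size, stride, dilation) tuples, applied in order.
    Each layer widens the field by (kernel_size - 1) * dilation, scaled by the
    product of the strides of all earlier layers.
    """
    rf, jump = 1, 1
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Three stacked 3x3 convolutions (stride 1) cover the same span as one 7x7.
small_stack = receptive_field([(3, 1, 1)] * 3)   # 7
one_large = receptive_field([(7, 1, 1)])         # 7

# A single hypothetical 13x13 kernel covers 13 pixels in one layer.
larger = receptive_field([(13, 1, 1)])           # 13
```

A single large kernel reaches a wide context in one layer, whereas small kernels need several stacked layers to cover the same span; this is the usual motivation for large kernels when the objects of interest occupy many pixels.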
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Industrial Vision Systems and Defect Detection
Methods: Softmax · Attention Is All You Need · Focus
