SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection
Jinqing Zhang, Yanan Zhang, Qingjie Liu, Yunhong Wang

TL;DR
SA-BEV introduces a semantic-aware approach for multi-view 3D object detection that filters background noise, enhances feature quality, and achieves state-of-the-art results in autonomous driving scenarios.
Contribution
The paper presents a novel semantic-aware BEV pooling method, a data augmentation strategy, and a multi-scale cross-task head, advancing camera-based 3D detection accuracy.
Findings
Achieves state-of-the-art performance on the nuScenes dataset.
Effectively filters background information to improve object detection.
Improves the accuracy of depth estimation and semantic segmentation.
Abstract
Recently, pure camera-based Bird's-Eye-View (BEV) perception has provided a feasible solution for economical autonomous driving. However, existing BEV-based multi-view 3D detectors generally transform all image features into BEV features, without considering that the large proportion of background information may submerge the object information. In this paper, we propose Semantic-Aware BEV Pooling (SA-BEVPool), which can filter out background information according to the semantic segmentation of image features and transform image features into semantic-aware BEV features. Accordingly, we propose BEV-Paste, an effective data augmentation strategy that closely matches the semantic-aware BEV features. In addition, we design a Multi-Scale Cross-Task (MSCT) head, which combines task-specific and cross-task information to predict depth distribution and semantic segmentation…
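The idea behind SA-BEVPool, as the abstract describes it, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `semantic_aware_bev_pool`, the `sem_thresh` parameter, the precomputed `bev_index` lookup, and the choice to weight each lifted point by the product of depth probability and foreground probability are all assumptions made for illustration.

```python
import numpy as np

def semantic_aware_bev_pool(feat, depth_prob, sem_prob, bev_index,
                            num_cells, sem_thresh=0.25):
    """Hypothetical sketch of semantic-aware BEV pooling for one camera.

    feat:       (C, H, W) image features
    depth_prob: (D, H, W) per-pixel depth distribution over D depth bins
    sem_prob:   (H, W)    per-pixel foreground probability
    bev_index:  (D, H, W) flat BEV cell id of each lifted point,
                -1 if the point falls outside the BEV grid (assumed precomputed)
    Returns a (num_cells, C) array of pooled BEV features.
    """
    C, H, W = feat.shape
    D = depth_prob.shape[0]

    # Keep only points from pixels predicted as foreground and inside the grid.
    foreground = sem_prob >= sem_thresh
    keep = np.broadcast_to(foreground, (D, H, W)) & (bev_index >= 0)
    d_idx, h_idx, w_idx = np.nonzero(keep)

    # Assumed weighting: depth probability times foreground probability.
    weights = depth_prob[d_idx, h_idx, w_idx] * sem_prob[h_idx, w_idx]

    # Lift the surviving pixels to weighted 3D point features, shape (N, C).
    points = feat[:, h_idx, w_idx].T * weights[:, None]

    # Sum-pool the points that land in the same BEV cell.
    bev = np.zeros((num_cells, C))
    np.add.at(bev, bev_index[d_idx, h_idx, w_idx], points)
    return bev
```

Background pixels never reach the BEV grid in this sketch, so they cannot dilute the cells that contain objects, which is the effect the abstract attributes to filtering by semantic segmentation.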
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection · Advanced Neural Network Applications
