SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection
Qiu Zhou, Jinming Cao, Hanchao Leng, Yifang Yin, Yu Kun, Roger Zimmermann

TL;DR
SOGDet introduces a 3D semantic-occupancy branch to enhance multi-view 3D object detection, improving accuracy by incorporating environmental context in autonomous driving scenarios.
Contribution
The paper presents a novel semantic-occupancy guided approach that can be integrated with existing BEV-based methods to improve 3D detection performance.
Findings
Consistently improves baseline methods on the nuScenes dataset.
Raises detection metrics such as NDS (nuScenes Detection Score) and mAP (mean Average Precision).
Leverages environmental context for more robust perception.
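For readers unfamiliar with the NDS metric named in the findings, the sketch below shows how the nuScenes benchmark combines mAP with five true-positive error terms (translation, scale, orientation, velocity, attribute) into a single score. The function name `nuscenes_detection_score` and the sample numbers are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def nuscenes_detection_score(mean_ap, tp_errors):
    """Combine mAP and the five mean true-positive errors into NDS.

    NDS = (5 * mAP + sum(1 - min(1, err) for each TP error)) / 10

    tp_errors holds mATE, mASE, mAOE, mAVE and mAAE, in any order.
    """
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five true-positive error terms")
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Illustrative values only (not taken from the paper).
example_nds = nuscenes_detection_score(0.5, [0.5, 0.5, 0.5, 0.5, 0.5])
print(f"NDS = {example_nds:.3f}")
```

Because mAP carries half of the total weight, a method can raise NDS either by detecting more objects correctly or by localizing the detected objects more precisely.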
Abstract
In the field of autonomous driving, accurate and comprehensive perception of the 3D environment is crucial. Bird's Eye View (BEV) based methods have emerged as a promising solution for 3D object detection using multi-view images as input. However, existing 3D object detection methods often ignore the physical context in the environment, such as sidewalk and vegetation, resulting in sub-optimal performance. In this paper, we propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection), that leverages a 3D semantic-occupancy branch to improve the accuracy of 3D object detection. In particular, the physical context modeled by semantic occupancy helps the detector to perceive the scenes in a more holistic view. Our SOGDet is flexible to use and can be seamlessly integrated with most existing BEV-based methods. To evaluate its effectiveness, we apply this…
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
