SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection

Qiu Zhou; Jinming Cao; Hanchao Leng; Yifang Yin; Yu Kun; Roger Zimmermann

arXiv:2308.13794 · cs.CV · January 9, 2024 · 2 cites

TL;DR

SOGDet introduces a 3D semantic-occupancy branch to enhance multi-view 3D object detection, improving accuracy by incorporating environmental context in autonomous driving scenarios.

Contribution

The paper presents a novel semantic-occupancy guided approach that can be integrated with existing BEV-based methods to improve 3D detection performance.

Findings

01

Consistently improves baseline methods on the nuScenes dataset.

02

Improves detection metrics such as the nuScenes Detection Score (NDS) and mean Average Precision (mAP).

03

Leverages environmental context for more robust perception.
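
For reference, the NDS named in the findings is the nuScenes Detection Score, which the nuScenes benchmark defines as a weighted combination of mAP and five mean true-positive error terms (mATE, mASE, mAOE, mAVE, mAAE): NDS = (1/10) * [5 * mAP + sum over the five errors of (1 - min(1, error))]. A minimal sketch of that formula follows. The function name `nds` and the input values are illustrative assumptions, not numbers reported for SOGDet.

```python
# Sketch of the nuScenes Detection Score (NDS) formula.
# The function name and the example inputs are hypothetical; they are not
# results from the SOGDet paper.

TP_METRICS = ("mATE", "mASE", "mAOE", "mAVE", "mAAE")


def nds(map_score, tp_errors):
    """Combine mAP and the five mean true-positive errors into NDS.

    map_score: mean Average Precision in [0, 1].
    tp_errors: dict mapping each name in TP_METRICS to a non-negative error.
    """
    if set(tp_errors) != set(TP_METRICS):
        raise ValueError("tp_errors must contain exactly: " + ", ".join(TP_METRICS))
    if not 0.0 <= map_score <= 1.0:
        raise ValueError("map_score must lie in [0, 1]")
    # Each error is clipped at 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, tp_errors[name]) for name in TP_METRICS]
    # mAP carries weight 5, each of the five TP scores carries weight 1.
    return (5.0 * map_score + sum(tp_scores)) / 10.0


if __name__ == "__main__":
    example_errors = {name: 0.5 for name in TP_METRICS}  # made-up values
    print(nds(0.4, example_errors))
```

Because mAP contributes half of the score and the error terms the other half, a method can raise NDS either by detecting more objects correctly or by localizing and describing detected objects more precisely.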

Abstract

In autonomous driving, accurate and comprehensive perception of the 3D environment is crucial. Bird's Eye View (BEV) based methods have emerged as a promising solution for 3D object detection using multi-view images as input. However, existing 3D object detection methods often ignore the physical context of the environment, such as sidewalks and vegetation, resulting in sub-optimal performance. In this paper, we propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) that leverages a 3D semantic-occupancy branch to improve the accuracy of 3D object detection. In particular, the physical context modeled by semantic occupancy helps the detector perceive the scene more holistically. SOGDet is flexible to use and can be seamlessly integrated with most existing BEV-based methods. To evaluate its effectiveness, we apply this…

Code & Models

Repositories

zhouqiu/sogdet
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques