DSOcc: Leveraging Depth Awareness and Semantic Aid to Boost Camera-Based 3D Semantic Occupancy Prediction

Naiyu Fang; Zheyuan Zhou; Kang Wang; Ruibo Li; Lemiao Qiu; Shuyou Zhang; Zhe Wang; Guosheng Lin

arXiv:2505.20951 · cs.CV · November 25, 2025

TL;DR

DSOcc combines depth awareness and semantic aid to improve camera-based 3D semantic occupancy prediction. It infers occupancy states and classes jointly, fuses semantics and occupancy across multiple frames, and reports state-of-the-art results on SemanticKITTI.

Contribution

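The paper proposes DSOcc, which infers occupancy state and occupancy class jointly instead of relying on explicit occupancy state inference. A non-learned soft occupancy confidence makes voxels depth-aware, and a well-trained image semantic segmentation model supplies the semantic cues. The authors report better accuracy and robustness than previous approaches.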

Findings

01. Achieves state-of-the-art performance on the SemanticKITTI dataset.

02. Demonstrates robustness through multi-frame fusion of semantics and occupancy.

03. Outperforms existing camera-based 3D occupancy prediction methods.

Abstract

Camera-based 3D semantic occupancy prediction offers an efficient and cost-effective solution for perceiving surrounding scenes in autonomous driving. However, existing works rely on explicit occupancy state inference, which leads to numerous incorrect feature assignments, and insufficient samples restrict the learning of occupancy class inference. To address these challenges, we propose leveraging Depth awareness and Semantic aid to boost camera-based 3D semantic Occupancy prediction (DSOcc). We jointly perform occupancy state and occupancy class inference: soft occupancy confidence is calculated by a non-learning method and multiplied with image features to make voxels aware of depth, enabling adaptive implicit occupancy state inference. Instead of enhancing feature learning, we directly utilize well-trained image semantic segmentation and fuse…
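
The abstract describes multiplying a soft occupancy confidence, computed without learning, into the image features so that voxels become aware of depth. The snippet below is only a minimal sketch of that idea in Python. The Gaussian falloff, the `sigma` parameter, and the function names `soft_occupancy_confidence` and `depth_aware_features` are illustrative assumptions, not the paper's formulation, which this page does not spell out.

```python
import math


def soft_occupancy_confidence(voxel_depth, estimated_depth, sigma=1.0):
    """Hypothetical non-learned confidence in (0, 1].

    ASSUMPTION: a Gaussian falloff around the estimated surface depth.
    The source only says the confidence comes from a non-learning method.
    """
    diff = voxel_depth - estimated_depth
    return math.exp(-(diff * diff) / (2.0 * sigma * sigma))


def depth_aware_features(image_feature, voxel_depths, estimated_depth, sigma=1.0):
    """Scale one pixel's feature vector for each voxel sampled along its camera ray."""
    weighted = []
    for depth in voxel_depths:
        confidence = soft_occupancy_confidence(depth, estimated_depth, sigma)
        # Multiply the confidence into every channel of the image feature.
        weighted.append([confidence * value for value in image_feature])
    return weighted


# Toy example: three voxels on one ray, with the surface estimated at 4.0 m.
voxel_depths = [2.0, 4.0, 6.0]
image_feature = [0.5, -1.0]
voxel_features = depth_aware_features(image_feature, voxel_depths, estimated_depth=4.0)

for depth, feature in zip(voxel_depths, voxel_features):
    print(depth, [round(value, 4) for value in feature])
```

In this sketch, the voxel at the estimated depth keeps the full feature, while voxels in front of or behind it receive an attenuated copy, so no hard occupied/empty decision is made at this stage.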
