SliceOcc: Indoor 3D Semantic Occupancy Prediction with Vertical Slice Representation

Jianing Li; Ming Lu; Hao Wang; Chenyang Gu; Wenzhao Zheng; Li Du; Shanghang Zhang

arXiv:2501.16684 · cs.CV · January 29, 2025

TL;DR

This paper introduces SliceOcc, a novel vertical slice representation and model for indoor 3D semantic occupancy prediction from RGB images, outperforming existing planar-based methods especially in occluded dense environments.

Contribution

The paper proposes a new vertical slice scene representation and a tailored RGB-based model, SliceOcc, for improved indoor 3D semantic occupancy prediction.

Findings

01 Achieves 15.45% mIoU on the EmbodiedScan dataset

02 Sets a new state of the art among RGB camera-based models

03 Remains effective in dense indoor environments with occlusions
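The headline number above is a mean intersection-over-union (mIoU) score. As a rough illustration of how such a score is typically computed over voxel labels, here is a minimal sketch. The function name `mean_iou`, the `ignore_index` parameter, and the rule of skipping classes that appear in neither prediction nor ground truth are assumptions made for this example; the exact evaluation protocol used on EmbodiedScan is not stated on this page.

```python
from typing import Optional, Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: Optional[int] = None,
) -> float:
    """Mean IoU over flattened per-voxel class labels (hypothetical helper).

    Classes absent from both prediction and ground truth are skipped,
    which is an assumption of this sketch, not a documented benchmark rule.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes

    for p, t in zip(pred, target):
        if ignore_index is not None and t == ignore_index:
            continue  # voxel carries no usable ground-truth label
        if p == t:
            # Correct voxel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong voxel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Six voxels, three classes; per-class IoU is 1/3, 2/3, and 1/2.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    print(mean_iou(pred, target, num_classes=3))
```

In practice a 3D occupancy grid would be flattened to a one-dimensional label sequence before being passed in, and an "unknown" or unobserved label could be excluded through `ignore_index`.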

Abstract

3D semantic occupancy prediction is a crucial task in visual perception, as it requires the simultaneous comprehension of both scene geometry and semantics. It plays a crucial role in understanding 3D scenes and has great potential for various applications, such as robotic vision perception and autonomous driving. Many existing works utilize planar-based representations such as Bird's Eye View (BEV) and Tri-Perspective View (TPV). These representations aim to simplify the complexity of 3D scenes while preserving essential object information, thereby facilitating efficient scene representation. However, in dense indoor environments with prevalent occlusions, directly applying these planar-based methods often leads to difficulties in capturing global semantic occupancy, ultimately degrading model performance. In this paper, we present a new vertical slice representation that divides the…

Code & Models

Repositories

northsummer/sliceocc (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Video Analysis and Summarization