Towards 3D Object-Centric Feature Learning for Semantic Scene Completion

Weihua Wang; Yubo Cui; Xiangru Lin; Zhiheng Li; Zheng Fang

arXiv:2511.13031 · cs.CV · December 23, 2025

TL;DR

This paper introduces Ocean, an object-centric framework for vision-based 3D semantic scene completion. It decomposes the scene into individual object instances and aggregates per-object features with linear attention, reaching mIoU scores of 17.40 on SemanticKITTI and 20.28 on SSCBench-KITTI360.

Contribution

The paper proposes an object-centric prediction framework, built on attention modules and diffusion processes, that improves semantic scene completion accuracy over ego-centric approaches.

Findings

01 Achieves a state-of-the-art mIoU of 17.40 on SemanticKITTI

02 Achieves a state-of-the-art mIoU of 20.28 on SSCBench-KITTI360

03 Outperforms existing methods on both benchmarks
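
For readers unfamiliar with the metric, mIoU is the intersection-over-union between predicted and ground-truth voxel labels, averaged over classes. The following is a minimal sketch, not the benchmarks' official evaluation code. The function name `mean_iou`, the `ignore_label` value of 255, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; benchmark protocols may also exclude certain classes (such as empty space) from the average.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_label=255):
    """Class-averaged IoU over voxel labels (illustrative sketch).

    pred, target: integer label arrays of the same shape.
    ignore_label: assumed marker for voxels excluded from evaluation.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Drop voxels whose ground truth is marked as ignored.
    valid = target != ignore_label
    pred, target = pred[valid], target[valid]

    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:  # skip classes absent from both arrays
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")

# Toy example: three classes over four voxels.
score = mean_iou([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)
```

Scores such as 17.40 and 20.28 are conventionally this quantity expressed as a percentage.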

Abstract

Vision-based 3D Semantic Scene Completion (SSC) has received growing attention due to its potential in autonomous driving. While most existing approaches follow an ego-centric paradigm by aggregating and diffusing features over the entire scene, they often overlook fine-grained object-level details, leading to semantic and geometric ambiguities, especially in complex environments. To address this limitation, we propose Ocean, an object-centric prediction framework that decomposes the scene into individual object instances to enable more accurate semantic occupancy prediction. Specifically, we first employ a lightweight segmentation model, MobileSAM, to extract instance masks from the input image. Then, we introduce a 3D Semantic Group Attention module that leverages linear attention to aggregate object-centric features in 3D space. To handle segmentation errors and missing instances, we…
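
The idea of aggregating features within object groups using linear attention can be illustrated with a small sketch. This is a generic kernelized linear attention restricted to groups, not the paper's 3D Semantic Group Attention module, whose details are not given in the abstract. The function names, the `elu(x) + 1` feature map, and the assumption that each 3D feature already carries an integer `group_ids` entry (for example, derived from an instance mask) are all hypothetical.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a positive feature map commonly used for linear attention.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def grouped_linear_attention(q, k, v, group_ids, num_groups):
    """Linear attention where each token attends only within its group.

    q, k: (N, d) queries and keys; v: (N, dv) values.
    group_ids: (N,) integer group index per token (assumed given).
    """
    phi_q, phi_k = feature_map(q), feature_map(k)
    d, dv = k.shape[1], v.shape[1]

    # Per-group summaries: sum of phi(k) v^T and sum of phi(k).
    kv = np.zeros((num_groups, d, dv))
    z = np.zeros((num_groups, d))
    np.add.at(kv, group_ids, phi_k[:, :, None] * v[:, None, :])
    np.add.at(z, group_ids, phi_k)

    # Each query reads only its own group's summary, so the cost is
    # linear in N rather than quadratic.
    num = np.einsum("nd,nde->ne", phi_q, kv[group_ids])
    den = np.einsum("nd,nd->n", phi_q, z[group_ids])
    return num / den[:, None]

# Toy usage: six tokens split across three groups.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 3))
groups = np.array([0, 0, 1, 1, 1, 2])
out = grouped_linear_attention(q, k, v, groups, num_groups=3)
```

Because the per-group summaries are computed once and shared by every query in the group, the result matches explicit within-group attention with the same feature map while avoiding the pairwise score matrix.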

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Robotics and Sensor-Based Localization