Language Driven Occupancy Prediction
Zhu Yu, Bowen Pang, Lizhe Liu, Runmin Zhang, Qiang Li, Si-Yuan Cao, Maochun Luo, Mingxia Chen, Sheng Yang, Hui-Liang Shen

TL;DR
LOcc introduces a semantic transitive labeling pipeline for open-vocabulary 3D occupancy prediction. The pipeline transfers text labels from images to LiDAR point clouds and then to voxels, producing dense language supervision that reduces manual labeling, and LOcc reports better performance than existing methods.
Contribution
The paper presents a semantic transitive labeling pipeline that generates dense, fine-grained 3D language occupancy ground truth, and the LOcc framework, which uses this dense language supervision for open-vocabulary 3D occupancy prediction.
Findings
LOcc outperforms state-of-the-art zero-shot methods on Occ3D-nuScenes.
Semantic transitive labeling improves pseudo-label accuracy.
LOcc reduces reliance on manual annotations.
Abstract
We introduce LOcc, an effective and generalizable framework for open-vocabulary occupancy (OVO) prediction. Previous approaches typically supervise the networks through coarse voxel-to-text correspondences via image features as intermediates or noisy and sparse correspondences from voxel-based model-view projections. To alleviate the inaccurate supervision, we propose a semantic transitive labeling pipeline to generate dense and fine-grained 3D language occupancy ground truth. Our pipeline presents a feasible way to dig into the valuable semantic information of images, transferring text labels from images to LiDAR point clouds and ultimately to voxels, to establish precise voxel-to-text correspondences. By replacing the original prediction head of supervised occupancy models with a geometry head for binary occupancy states and a language head for language features, LOcc effectively uses…
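The transitive transfer described in the abstract (image labels to LiDAR points, then points to voxels) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a per-pixel text label map is already available, that the points are already expressed in the camera frame, a simple pinhole camera, and majority voting inside each voxel. All function names and parameters (`project_to_pixel`, `labels_for_points`, `labels_for_voxels`, `voxel_size`) are hypothetical.

```python
from collections import Counter, defaultdict
from math import floor


def project_to_pixel(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (x, y, z) to a pixel (u, v)."""
    x, y, z = point
    if z <= 0:  # point lies behind the camera, so it has no pixel
        return None
    return floor(fx * x / z + cx), floor(fy * y / z + cy)


def labels_for_points(points, label_map, fx, fy, cx, cy):
    """Step 1 (image -> points): read the text label under each projected point.

    label_map[v][u] is assumed to hold one text label per pixel.
    Points outside the image or behind the camera get None.
    """
    height, width = len(label_map), len(label_map[0])
    labels = []
    for point in points:
        pixel = project_to_pixel(point, fx, fy, cx, cy)
        if pixel is None or not (0 <= pixel[0] < width and 0 <= pixel[1] < height):
            labels.append(None)
        else:
            u, v = pixel
            labels.append(label_map[v][u])
    return labels


def labels_for_voxels(points, labels, voxel_size):
    """Step 2 (points -> voxels): majority vote over labeled points in each voxel."""
    votes = defaultdict(Counter)
    for (x, y, z), label in zip(points, labels):
        if label is None:  # unlabeled points do not vote
            continue
        index = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        votes[index][label] += 1
    return {index: counter.most_common(1)[0][0] for index, counter in votes.items()}


# Toy 2x2 label map and six points (assumed values, for illustration only).
label_map = [["road", "road"],
             ["road", "car"]]
points = [
    (0.1, 0.1, 1.0),   # projects to pixel (0, 0)
    (0.2, 0.3, 1.0),   # projects to pixel (0, 0)
    (0.7, 0.8, 1.0),   # projects to pixel (1, 1)
    (2.4, 2.4, 3.0),   # projects to pixel (1, 1)
    (0.0, 0.0, -1.0),  # behind the camera
    (5.0, 0.0, 1.0),   # outside the image
]
labels = labels_for_points(points, label_map, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
voxels = labels_for_voxels(points, labels, voxel_size=1.0)
print(labels)
print(voxels)
```

In this toy setup the first three points fall into the same voxel, and the vote resolves the disagreement between "road" and "car" in favor of the majority label. A real pipeline would also need the LiDAR-to-camera extrinsics and some handling of occlusion, both of which this sketch leaves out.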
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Web Data Mining and Analysis
