Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving
Xiaoyu Tian, Tao Jiang, Longfei Yun, Yucheng Mao, Huitong Yang, Yue Wang, Yilun Wang, Hang Zhao

TL;DR
This paper introduces Occ3D, a large-scale benchmark for 3D occupancy prediction in autonomous driving, together with a label generation pipeline and a Coarse-to-Fine Occupancy (CTF-Occ) network that outperforms the baselines.
Contribution
It provides the first large-scale 3D occupancy prediction benchmarks based on Waymo and nuScenes datasets, a comprehensive label generation pipeline, and a new model demonstrating superior performance.
Findings
The Occ3D benchmarks enable standardized evaluation of 3D occupancy prediction.
The label generation pipeline produces dense, visibility-aware labels for scenes.
The CTF-Occ model achieves state-of-the-art results on the benchmarks.
Abstract
Robotic perception requires the modeling of both 3D geometry and semantics. Existing methods typically focus on estimating 3D bounding boxes, neglecting finer geometric details and struggling to handle general, out-of-vocabulary objects. 3D occupancy prediction, which estimates the detailed occupancy states and semantics of a scene, is an emerging task to overcome these limitations. To support 3D occupancy prediction, we develop a label generation pipeline that produces dense, visibility-aware labels for any given scene. This pipeline comprises three stages: voxel densification, occlusion reasoning, and image-guided voxel refinement. We establish two benchmarks, derived from the Waymo Open Dataset and the nuScenes Dataset, namely Occ3D-Waymo and Occ3D-nuScenes benchmarks. Furthermore, we provide an extensive analysis of the proposed dataset with various baseline models. Lastly, we…
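The abstract names occlusion reasoning as the stage that makes the labels visibility-aware. A minimal sketch of that idea follows, assuming simple ray casting from a single LiDAR origin: voxels containing a return are occupied, voxels crossed by a ray before its return are free, and everything else is unobserved. The function name, label codes, and sampling step are hypothetical choices made for illustration. This is not the authors' implementation, and it omits voxel densification and image-guided refinement entirely.

```python
import numpy as np

# Hypothetical label codes, chosen for this sketch only.
FREE, OCCUPIED, UNOBSERVED = 0, 1, 2


def visibility_aware_labels(points, sensor_origin, grid_min, voxel_size,
                            grid_shape, step=0.5):
    """Label each voxel as free, occupied, or unobserved from one LiDAR sweep.

    points:        (N, 3) array of LiDAR returns in the grid's coordinate frame.
    sensor_origin: (3,) sensor position in the same frame.
    step:          ray sampling interval, as a fraction of the voxel size.
    """
    points = np.asarray(points, dtype=float)
    origin = np.asarray(sensor_origin, dtype=float)
    grid_min = np.asarray(grid_min, dtype=float)
    shape = np.asarray(grid_shape)

    # Every voxel starts as unobserved until a ray touches it.
    labels = np.full(grid_shape, UNOBSERVED, dtype=np.uint8)

    def to_index(xyz):
        return np.floor((xyz - grid_min) / voxel_size).astype(int)

    def in_grid(idx):
        return np.all((idx >= 0) & (idx < shape), axis=-1)

    # Free space: sample along each ray between the sensor and its return.
    for p in points:
        length = np.linalg.norm(p - origin)
        n_samples = max(int(length / (step * voxel_size)), 1)
        t = np.linspace(0.0, 1.0, n_samples, endpoint=False)
        samples = origin + t[:, None] * (p - origin)
        idx = to_index(samples)
        idx = idx[in_grid(idx)]
        labels[idx[:, 0], idx[:, 1], idx[:, 2]] = FREE

    # Occupied voxels are written last so they override free-space samples.
    idx = to_index(points)
    idx = idx[in_grid(idx)]
    labels[idx[:, 0], idx[:, 1], idx[:, 2]] = OCCUPIED
    return labels
```

For a toy 5 x 1 x 1 grid of 1 m voxels with the sensor in the first voxel and a single return in the fourth, the call `visibility_aware_labels(np.array([[3.5, 0.5, 0.5]]), (0.5, 0.5, 0.5), (0, 0, 0), 1.0, (5, 1, 1))` marks the first three voxels free, the fourth occupied, and the fifth, which lies behind the return, unobserved.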
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Medical Image Segmentation Techniques
