OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

Guoqing Wang; Zhongdao Wang; Pin Tang; Jilai Zheng; Xiangxuan Ren; Bailan Feng; Chao Ma

arXiv:2404.15014 · cs.CV · April 24, 2024 · 1 citation

Open Access

TL;DR

OccGen is a generative model for 3D semantic occupancy prediction in autonomous driving that refines predictions through a diffusion process, outperforming existing methods and providing uncertainty estimates.

Contribution

It introduces a novel generative diffusion-based approach for 3D occupancy prediction, enabling progressive refinement and scene imagination capabilities.

Findings

01. Improves mIoU by up to 13.3% on the nuScenes-Occupancy dataset.

02. Outperforms state-of-the-art discriminative methods.

03. Provides uncertainty estimates alongside predictions.
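
For readers unfamiliar with the metric behind the first finding, the following is a minimal sketch of mean intersection-over-union (mIoU) over a flattened voxel grid. It is not the benchmark's evaluation code: the function name `mean_iou`, the flat-list layout, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and the official protocol may treat ignored or empty voxels differently.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average per-class IoU over a flattened grid of semantic labels.

    Assumption: classes that appear in neither pred nor target are skipped
    rather than counted as IoU 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")

    ious = []
    for cls in range(num_classes):
        # Voxels where both agree on this class.
        intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        # Voxels where either one assigns this class.
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)

    return sum(ious) / len(ious) if ious else 0.0


# Four voxels, three classes: class 0 matches fully, classes 1 and 2 half overlap.
score = mean_iou([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoUs are 1, 1/2, and 1/2, so the mean is 2/3.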

Abstract

Existing solutions for 3D semantic occupancy prediction typically treat the task as a one-shot 3D voxel-wise segmentation problem. These discriminative methods learn the mapping from the inputs to the occupancy map in a single step, so they can neither gradually refine the occupancy map nor plausibly imagine the scene to complete local regions. In this paper, we introduce OccGen, a simple yet powerful generative perception model for the task of 3D semantic occupancy prediction. OccGen adopts a "noise-to-occupancy" generative paradigm, progressively inferring and refining the occupancy map by predicting and eliminating noise originating from a random Gaussian distribution. OccGen consists of two main components: a conditional encoder that is capable of processing multi-modal inputs, and a progressive refinement decoder that applies…
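
The "noise-to-occupancy" idea in the abstract can be illustrated with a toy refinement loop. This is only a sketch under strong assumptions, not OccGen's architecture or sampler: `toy_denoiser` is a hypothetical stand-in for a learned network, the conditioning signal is a short list of numbers rather than encoded multi-modal features, and the deterministic update rule was chosen for simplicity. It shows only the general shape of the paradigm: start from Gaussian noise, repeatedly estimate the clean map given the condition, and carry a shrinking share of the residual into the next step.

```python
import random
from typing import Callable, List

Denoiser = Callable[[List[float], List[float], float], List[float]]


def toy_denoiser(noisy: List[float], condition: List[float], t: float) -> List[float]:
    """Hypothetical stand-in for a learned network.

    Estimates the clean map by pulling each value toward the conditioning
    signal, trusting the condition more as the noise level t shrinks.
    """
    return [c + 0.5 * t * (x - c) for x, c in zip(noisy, condition)]


def refine_from_noise(condition: List[float], denoiser: Denoiser,
                      num_steps: int = 4, seed: int = 0) -> List[List[float]]:
    """Return every intermediate map, from pure noise to the final estimate."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in condition]  # start from Gaussian noise
    trajectory = [x]
    for step in range(num_steps, 0, -1):
        t = step / num_steps             # current noise level in (0, 1]
        t_next = (step - 1) / num_steps  # noise level after this step
        x0_hat = denoiser(x, condition, t)
        # Toy deterministic update: keep only a t_next share of the residual.
        x = [est + t_next * (cur - est) for cur, est in zip(x, x0_hat)]
        trajectory.append(x)
    return trajectory


def l1_error(a: List[float], b: List[float]) -> float:
    return sum(abs(p - q) for p, q in zip(a, b))


# Toy "scene": 1.0 marks an occupied voxel, 0.0 an empty one.
condition = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
trajectory = refine_from_noise(condition, toy_denoiser)
for i, state in enumerate(trajectory):
    print(f"step {i}: L1 error to condition = {l1_error(state, condition):.4f}")
occupancy = [int(v > 0.5) for v in trajectory[-1]]
print("thresholded occupancy:", occupancy)
```

Because each intermediate map is kept, the loop can be stopped early for a coarser estimate or run longer for a finer one, which is the kind of trade-off a progressive refinement scheme makes possible.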

Taxonomy

Topics: Video Surveillance and Tracking Methods · Autonomous Vehicle Technology and Safety · Automated Road and Building Extraction

Methods: Focus · Diffusion