GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction
Yuanhui Huang, Amonnut Thammatadatrakoon, Wenzhao Zheng, Yunpeng Zhang, Dalong Du, Jiwen Lu

TL;DR
GaussianFormer-2 introduces a probabilistic Gaussian superposition model for 3D semantic occupancy prediction in autonomous driving. It treats each Gaussian as the probability that its neighborhood is occupied, so a sparse set of Gaussians can describe the scene more accurately and efficiently than one that wastes capacity on empty space.
Contribution
It proposes a novel probabilistic Gaussian superposition approach with a distribution-based initialization for sparse 3D scene understanding, outperforming existing dense grid methods.
Findings
Achieves state-of-the-art performance on nuScenes and KITTI-360 datasets.
Demonstrates high efficiency in 3D occupancy prediction.
Effectively models sparse scene geometry and semantics.
Abstract
3D semantic occupancy prediction is an important task for robust vision-centric autonomous driving, which predicts fine-grained geometry and semantics of the surrounding scene. Most existing methods leverage dense grid-based scene representations, overlooking the spatial sparsity of the driving scenes. Although 3D semantic Gaussian serves as an object-centric sparse alternative, most of the Gaussians still describe the empty region with low efficiency. To address this, we propose a probabilistic Gaussian superposition model which interprets each Gaussian as a probability distribution of its neighborhood being occupied and conforms to probabilistic multiplication to derive the overall geometry. Furthermore, we adopt the exact Gaussian mixture model for semantics calculation to avoid unnecessary overlapping of Gaussians. To effectively initialize Gaussians in non-empty region, we design a…
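The probabilistic multiplication and Gaussian mixture semantics described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each Gaussian's occupancy probability at a point is the unnormalized Gaussian kernel exp(-0.5 * d^T Sigma^-1 d), that "probabilistic multiplication" means a point is empty only if every Gaussian leaves it empty, and that semantics are a posterior-weighted average of per-Gaussian class probabilities. The function names, the `weights` prior, and the `eps` guard are hypothetical.

```python
import numpy as np

def gaussian_occupancy(points, means, covs):
    """Assumed per-Gaussian occupancy: alpha_i(x) = exp(-0.5 * d^T Sigma_i^-1 d).

    points: (P, 3), means: (G, 3), covs: (G, 3, 3). Returns (P, G).
    """
    diff = points[:, None, :] - means[None, :, :]            # (P, G, 3)
    prec = np.linalg.inv(covs)                                # (G, 3, 3)
    maha = np.einsum("pgi,gij,pgj->pg", diff, prec, diff)     # squared Mahalanobis distance
    return np.exp(-0.5 * maha)

def superpose(alpha):
    """Probabilistic multiplication: occupied unless every Gaussian says empty."""
    return 1.0 - np.prod(1.0 - alpha, axis=1)                 # (P,)

def mixture_semantics(points, means, covs, weights, class_probs, eps=1e-30):
    """Assumed GMM semantics: posterior-weighted mix of per-Gaussian class probabilities.

    weights: (G,) mixture priors, class_probs: (G, C). Returns (P, C).
    """
    alpha = gaussian_occupancy(points, means, covs)
    # Normalized Gaussian density: kernel divided by sqrt((2*pi)^3 * det(Sigma)).
    norm = np.sqrt((2.0 * np.pi) ** 3 * np.linalg.det(covs))  # (G,)
    joint = weights[None, :] * alpha / norm[None, :]          # (P, G)
    posterior = joint / (joint.sum(axis=1, keepdims=True) + eps)
    return posterior @ class_probs
```

Under this reading, two Gaussians that each assign probability 0.5 to a point give an overall occupancy of 1 - 0.5 * 0.5 = 0.75, whereas a plain sum would reach 1.0; the product form keeps the result inside [0, 1] regardless of how many Gaussians overlap.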
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · 3D Shape Modeling and Analysis
Methods: ADaptive gradient method with the OPTimal convergence rate
