ODG: Occupancy Prediction Using Dual Gaussians
Yunxiao Shi, Yinhao Zhu, Shizhong Han, Jisoo Jeong, Amin Ansari, Hong Cai, Fatih Porikli

TL;DR
ODG is a hierarchical dual sparse Gaussian representation for camera-based occupancy prediction in autonomous driving. It models static and dynamic scene elements with separate Gaussian queries and reports state-of-the-art accuracy at low inference cost.
Contribution
The paper proposes a dual Gaussian query representation that separates static and dynamic scene elements, together with a hierarchical Gaussian transformer that predicts occupied voxel centers, semantic classes, and Gaussian parameters.
Findings
Sets new state-of-the-art results on Occ3D-nuScenes and Occ3D-Waymo benchmarks.
Achieves high accuracy with low inference cost.
Uses the real-time rendering capability of Gaussian Splatting to add rendering supervision.
Abstract
Occupancy prediction infers fine-grained 3D geometry and semantics from camera images of the surrounding environment, making it a critical perception task for autonomous driving. Existing methods either adopt dense grids as the scene representation, which are difficult to scale to high resolution, or learn the entire scene using a single set of sparse queries, which is insufficient to handle the varied object characteristics. In this paper, we present ODG, a hierarchical dual sparse Gaussian representation to effectively capture complex scene dynamics. Building upon the observation that driving scenes can be universally decomposed into static and dynamic counterparts, we define dual Gaussian queries to better model the diverse scene objects. We utilize a hierarchical Gaussian transformer to predict the occupied voxel centers and semantic classes along with the Gaussian parameters.…
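To make the output format described in the abstract concrete, the following is a minimal sketch of how sparse predictions (occupied voxel centers plus semantic classes) from two query sets could be merged and scattered into a dense semantic occupancy grid. It is not the authors' implementation: the function name `voxelize`, the grid origin, the voxel size, the grid shape, and the label values are all hypothetical choices made for illustration.

```python
import numpy as np

def voxelize(centers, labels, grid_min, voxel_size, grid_shape, free_label=0):
    """Scatter sparse (center, class) predictions into a dense semantic grid.

    All parameters are illustrative assumptions, not values from the paper.
    """
    grid = np.full(grid_shape, free_label, dtype=np.int64)
    # Map metric coordinates to integer voxel indices.
    idx = np.floor((centers - grid_min) / voxel_size).astype(np.int64)
    # Drop predictions that fall outside the grid bounds.
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    idx, labels = idx[inside], labels[inside]
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = labels
    return grid

# Hypothetical outputs of a "static" and a "dynamic" query set.
static_centers = np.array([[0.1, 0.1, 0.1], [1.3, 0.2, 0.1]])
static_labels = np.array([1, 1])   # e.g. a static class such as road
dynamic_centers = np.array([[0.6, 1.6, 0.2], [9.0, 9.0, 9.0]])
dynamic_labels = np.array([2, 2])  # e.g. a dynamic class such as car

# Merge both sets before voxelization; the last dynamic point lies
# outside the 4x4x4 grid and is discarded.
centers = np.concatenate([static_centers, dynamic_centers])
labels = np.concatenate([static_labels, dynamic_labels])
grid = voxelize(centers, labels, grid_min=np.zeros(3),
                voxel_size=0.5, grid_shape=(4, 4, 4))
```

Because only occupied centers are predicted, the cost of this step grows with the number of predicted points rather than with the full grid resolution, which is the usual motivation for sparse representations over dense grids.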
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Computer Graphics and Visualization Techniques
Methods: ADaptive gradient method with the OPTimal convergence rate · Sparse Evolutionary Training
