Gau-Occ: Geometry-Completed Gaussians for Multi-Modal 3D Occupancy Prediction
Chengxin Lv, Yihui Li, Hongyu Yang, YunHong Wang

TL;DR
Gau-Occ introduces a novel multi-modal 3D occupancy prediction framework that models scenes with semantic 3D Gaussians, leveraging LiDAR completion and efficient multi-view fusion to achieve high accuracy with reduced computational cost.
Contribution
The paper proposes Gau-Occ, a compact Gaussian-based scene representation with LiDAR completion and geometry-aligned multi-view fusion, advancing efficiency and accuracy in 3D occupancy prediction.
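To make the idea of a compact Gaussian-based scene representation concrete, here is a minimal sketch of how a small set of semantic 3D Gaussians could be turned into voxel occupancy labels. It is a generic illustration, not the authors' implementation: the names SemanticGaussian, gaussian_weight, voxel_label, splat_to_grid and EMPTY are hypothetical, and the axis-aligned covariance, the opacity-weighted sum of class probabilities, and the density threshold are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

EMPTY = -1  # hypothetical label for free space (not taken from the paper)

Vec3 = Tuple[float, float, float]


@dataclass
class SemanticGaussian:
    mean: Vec3                    # Gaussian center, in metres
    scale: Vec3                   # per-axis standard deviation (axis-aligned: an assumption)
    opacity: float                # scalar weight in [0, 1]
    class_probs: Sequence[float]  # per-class probabilities carried by this Gaussian


def gaussian_weight(g: SemanticGaussian, point: Vec3) -> float:
    """Opacity times the unnormalized Gaussian density at `point`."""
    d2 = sum(((p - m) / s) ** 2 for p, m, s in zip(point, g.mean, g.scale))
    return g.opacity * math.exp(-0.5 * d2)


def voxel_label(gaussians: List[SemanticGaussian], point: Vec3,
                threshold: float = 0.1) -> int:
    """Accumulate class scores from all Gaussians and pick the best class.

    Returns EMPTY when the total accumulated score is below `threshold`.
    """
    if not gaussians:
        return EMPTY
    num_classes = len(gaussians[0].class_probs)
    scores = [0.0] * num_classes
    for g in gaussians:
        w = gaussian_weight(g, point)
        for c in range(num_classes):
            scores[c] += w * g.class_probs[c]
    if sum(scores) < threshold:
        return EMPTY
    return max(range(num_classes), key=lambda c: scores[c])


def splat_to_grid(gaussians: List[SemanticGaussian], origin: Vec3,
                  voxel_size: float, dims: Tuple[int, int, int],
                  threshold: float = 0.1) -> Dict[Tuple[int, int, int], int]:
    """Label every voxel center of a regular grid from the Gaussian set."""
    grid = {}
    for i in range(dims[0]):
        for j in range(dims[1]):
            for k in range(dims[2]):
                center = (origin[0] + (i + 0.5) * voxel_size,
                          origin[1] + (j + 0.5) * voxel_size,
                          origin[2] + (k + 0.5) * voxel_size)
                grid[(i, j, k)] = voxel_label(gaussians, center, threshold)
    return grid
```

The dense triple loop is only there for readability. The point of the sketch is that the scene state lives in the list of Gaussians, whose size is independent of the grid resolution; it makes no claim about how Gau-Occ itself produces its final occupancy output.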
Findings
Reports state-of-the-art performance on challenging occupancy benchmarks.
Reduces computational cost by avoiding dense voxel or BEV tensors.
Recovers missing structures from sparse LiDAR with the LiDAR Completion Diffuser (LCD).
Abstract
3D semantic occupancy prediction is crucial for autonomous driving. While multi-modal fusion improves accuracy over vision-only methods, it typically relies on computationally expensive dense voxel or BEV tensors. We present Gau-Occ, a multi-modal framework that bypasses dense volumetric processing by modeling the scene as a compact collection of semantic 3D Gaussians. To ensure geometric completeness, we propose a LiDAR Completion Diffuser (LCD) that recovers missing structures from sparse LiDAR to initialize robust Gaussian anchors. Furthermore, we introduce Gaussian Anchor Fusion (GAF), which efficiently integrates multi-view image semantics via geometry-aligned 2D sampling and cross-modal alignment. By refining these compact Gaussian descriptors, Gau-Occ captures both spatial consistency and semantic discriminability. Extensive experiments across challenging benchmarks demonstrate…
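The geometry-aligned 2D sampling mentioned in the abstract can be pictured as projecting each Gaussian anchor into every camera view and reading image features at the projected pixel. The sketch below illustrates that general pattern under stated assumptions; it is not the paper's GAF module. The function names (project_point, bilinear_sample, fuse_views) are hypothetical, the camera is an ideal pinhole with intrinsics (fx, fy, cx, cy), the anchor is assumed to be already expressed in each camera's coordinate frame, and views are combined by a plain average rather than any learned cross-modal alignment.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Intrinsics = Tuple[float, float, float, float]  # (fx, fy, cx, cy), ideal pinhole
FeatureMap = List[List[List[float]]]            # feature_map[row][col] -> channels


def project_point(point_cam: Vec3, intr: Intrinsics) -> Optional[Tuple[float, float]]:
    """Project a camera-frame 3D point to pixel coordinates (u = col, v = row)."""
    x, y, z = point_cam
    if z <= 0.0:  # behind the camera: not visible in this view
        return None
    fx, fy, cx, cy = intr
    return (fx * x / z + cx, fy * y / z + cy)


def bilinear_sample(feature_map: FeatureMap, u: float, v: float) -> Optional[List[float]]:
    """Bilinearly interpolate features at (u, v); None if outside the map."""
    h, w = len(feature_map), len(feature_map[0])
    if not (0.0 <= u <= w - 1 and 0.0 <= v <= h - 1):
        return None
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    channels = len(feature_map[0][0])
    out = []
    for c in range(channels):
        top = feature_map[v0][u0][c] * (1 - du) + feature_map[v0][u1][c] * du
        bottom = feature_map[v1][u0][c] * (1 - du) + feature_map[v1][u1][c] * du
        out.append(top * (1 - dv) + bottom * dv)
    return out


def fuse_views(anchor_per_view: Sequence[Vec3],
               feature_maps: Sequence[FeatureMap],
               intrinsics: Sequence[Intrinsics]) -> Optional[List[float]]:
    """Average the sampled features over all views in which the anchor is visible."""
    samples = []
    for point_cam, fmap, intr in zip(anchor_per_view, feature_maps, intrinsics):
        uv = project_point(point_cam, intr)
        if uv is None:
            continue
        feat = bilinear_sample(fmap, uv[0], uv[1])
        if feat is not None:
            samples.append(feat)
    if not samples:
        return None  # anchor is not seen by any camera
    channels = len(samples[0])
    return [sum(s[c] for s in samples) / len(samples) for c in range(channels)]
```

Because sampling happens only at the projected anchor locations, the cost of this step grows with the number of anchors and views rather than with the size of a dense 3D grid, which is the efficiency argument the abstract makes for working on Gaussians.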
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
