Gau-Occ: Geometry-Completed Gaussians for Multi-Modal 3D Occupancy Prediction

Chengxin Lv; Yihui Li; Hongyu Yang; YunHong Wang

arXiv:2603.22852 · cs.CV · March 25, 2026

TL;DR

Gau-Occ introduces a novel multi-modal 3D occupancy prediction framework that models scenes with semantic 3D Gaussians, leveraging LiDAR completion and efficient multi-view fusion to achieve high accuracy with reduced computational cost.

Contribution

The paper proposes Gau-Occ, a compact Gaussian-based scene representation with LiDAR completion and geometry-aligned multi-view fusion, advancing efficiency and accuracy in 3D occupancy prediction.
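To make the idea of a compact Gaussian-based scene representation concrete, here is a minimal sketch of how a handful of semantic 3D Gaussians could be queried to fill a voxel occupancy grid. It is an illustration only, not the paper's implementation: the `SemanticGaussian` fields, the axis-aligned covariance, the `empty_threshold` value, and all function names are assumptions introduced for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class SemanticGaussian:
    # All fields are hypothetical; the paper's parameterisation may differ.
    mean: tuple      # (x, y, z) centre in metres
    scale: tuple     # per-axis standard deviation (axis-aligned for simplicity)
    opacity: float   # scalar weight in [0, 1]
    logits: tuple    # one score per semantic class


def gaussian_weight(g, point):
    """Opacity-scaled, unnormalised Gaussian response of `g` at `point`."""
    d2 = sum(((p - m) / s) ** 2 for p, m, s in zip(point, g.mean, g.scale))
    return g.opacity * math.exp(-0.5 * d2)


def voxel_semantics(gaussians, point, empty_threshold=0.05):
    """Return the winning class index at `point`, or -1 if treated as empty."""
    num_classes = len(gaussians[0].logits)
    scores = [0.0] * num_classes
    total = 0.0
    for g in gaussians:
        w = gaussian_weight(g, point)
        total += w
        for c in range(num_classes):
            scores[c] += w * g.logits[c]
    if total < empty_threshold:  # assumed cut-off for free space
        return -1
    return max(range(num_classes), key=lambda c: scores[c])


def splat_to_grid(gaussians, origin, voxel_size, dims):
    """Query every voxel centre of a dims[0] x dims[1] x dims[2] grid."""
    grid = {}
    for i in range(dims[0]):
        for j in range(dims[1]):
            for k in range(dims[2]):
                centre = tuple(
                    o + (idx + 0.5) * voxel_size
                    for o, idx in zip(origin, (i, j, k))
                )
                grid[(i, j, k)] = voxel_semantics(gaussians, centre)
    return grid


# Two Gaussians of different classes, three metres apart along x.
scene = [
    SemanticGaussian((0.5, 0.5, 0.5), (0.3, 0.3, 0.3), 0.9, (2.0, 0.1)),
    SemanticGaussian((3.5, 0.5, 0.5), (0.3, 0.3, 0.3), 0.9, (0.1, 2.0)),
]
grid = splat_to_grid(scene, origin=(0.0, 0.0, 0.0), voxel_size=1.0, dims=(4, 1, 1))
```

The point of the sketch is that the scene is stored as a short list of primitives, and a dense grid is produced only at query time. A practical system would restrict each voxel to nearby Gaussians instead of looping over all of them.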

Findings

01 Achieves state-of-the-art performance on benchmarks.

02 Significantly reduces computational complexity.

03 Effectively recovers missing structures from sparse LiDAR.

Abstract

3D semantic occupancy prediction is crucial for autonomous driving. While multi-modal fusion improves accuracy over vision-only methods, it typically relies on computationally expensive dense voxel or BEV tensors. We present Gau-Occ, a multi-modal framework that bypasses dense volumetric processing by modeling the scene as a compact collection of semantic 3D Gaussians. To ensure geometric completeness, we propose a LiDAR Completion Diffuser (LCD) that recovers missing structures from sparse LiDAR to initialize robust Gaussian anchors. Furthermore, we introduce Gaussian Anchor Fusion (GAF), which efficiently integrates multi-view image semantics via geometry-aligned 2D sampling and cross-modal alignment. By refining these compact Gaussian descriptors, Gau-Occ captures both spatial consistency and semantic discriminability. Extensive experiments across challenging benchmarks demonstrate…
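The geometry-aligned 2D sampling mentioned in the abstract can be pictured as projecting each 3D anchor into a camera image and reading the image feature at that pixel. The sketch below shows this generic operation with a pinhole camera and bilinear interpolation. It is not taken from the paper: the function names, the scalar feature map, and the `(fx, fy, cx, cy)` intrinsics tuple are assumptions made for illustration, and the abstract does not specify how GAF actually samples or fuses features.

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates.

    Returns None when the point lies on or behind the image plane.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def bilinear_sample(feature_map, u, v):
    """Bilinearly interpolate a 2D grid (rows index v, columns index u).

    Returns None when (u, v) falls outside the grid.
    """
    h, w = len(feature_map), len(feature_map[0])
    if not (0 <= u <= w - 1 and 0 <= v <= h - 1):
        return None
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    top = (1 - du) * feature_map[v0][u0] + du * feature_map[v0][u1]
    bottom = (1 - du) * feature_map[v1][u0] + du * feature_map[v1][u1]
    return (1 - dv) * top + dv * bottom


def sample_anchor_feature(anchor_cam, feature_map, intrinsics):
    """Project one anchor (already in the camera frame) and sample its feature."""
    uv = project_point(anchor_cam, *intrinsics)
    if uv is None:
        return None
    return bilinear_sample(feature_map, *uv)


# Toy 2x2 scalar feature map and hypothetical intrinsics (fx, fy, cx, cy).
feature_map = [[0.0, 1.0],
               [2.0, 3.0]]
intrinsics = (1.0, 1.0, 0.5, 0.5)

in_front = sample_anchor_feature((0.0, 0.0, 2.0), feature_map, intrinsics)
behind = sample_anchor_feature((0.0, 0.0, -1.0), feature_map, intrinsics)
```

Here `in_front` lands at the centre of the toy map and `behind` is rejected. Sampling only at projected anchor locations keeps the cost proportional to the number of anchors, not to the size of a dense voxel or BEV tensor. Real feature maps carry a channel vector per pixel, and a multi-view setup would repeat this per camera after applying each camera's extrinsics.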

Taxonomy

Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Advanced Vision and Imaging