OG-Gaussian: Occupancy Based Street Gaussians for Autonomous Driving

Yedong Shen; Xinran Zhang; Yifan Duan; Shiqi Zhang; Heng Li; Yilong Wu; Jianmin Ji; Yanyong Zhang

arXiv:2502.14235·cs.CV·February 21, 2025

Open Access

TL;DR

OG-Gaussian introduces a novel method for 3D scene reconstruction in autonomous driving that uses camera-based occupancy grids and learning-based dynamic object tracking, reducing reliance on expensive sensors and annotations.

Contribution

The paper presents OG-Gaussian, a new approach that replaces LiDAR with camera-derived occupancy grids and employs learning-based methods for dynamic object reconstruction and tracking.

Findings

01. Achieves state-of-the-art reconstruction quality with PSNR of 35.13.

02. Runs at 143 FPS, enabling real-time applications.

03. Reduces computational costs compared to LiDAR-based methods.
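
For readers unfamiliar with the PSNR figure quoted in the findings, the metric can be sketched as follows. This is a minimal illustration using the standard definition, PSNR = 10 · log10(MAX² / MSE), reported in dB; it is not the paper's evaluation code. The function name `psnr`, the flat-list image representation, and the default `max_value` of 1.0 are assumptions made for this example.

```python
import math
from typing import Sequence


def psnr(rendered: Sequence[float],
         reference: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images.

    Assumes pixel intensities lie in [0, max_value]; both the flat layout
    and the default range are illustrative choices, not the paper's.
    """
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("images must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the rendered image is closer to the reference, so a uniform error of 0.1 on a [0, 1] intensity scale corresponds to 20 dB.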

Abstract

Accurate and realistic 3D scene reconstruction enables the lifelike creation of autonomous driving simulation environments. With advancements in 3D Gaussian Splatting (3DGS), previous studies have applied it to reconstruct complex dynamic driving scenes. These methods typically require expensive LiDAR sensors and pre-annotated datasets of dynamic objects. To address these challenges, we propose OG-Gaussian, a novel approach that replaces LiDAR point clouds with Occupancy Grids (OGs) generated from surround-view camera images using an Occupancy Prediction Network (ONet). Our method leverages the semantic information in OGs to separate dynamic vehicles from static street background, converting these grids into two distinct sets of initial point clouds for reconstructing both static and dynamic objects. Additionally, we estimate the trajectories and poses of dynamic objects through a…
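
The step described in the abstract, using semantic labels in the occupancy grid to produce separate static and dynamic initial point clouds, might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the sparse dictionary grid, the label ids (`FREE`, `ROAD`, `BUILDING`, `VEHICLE`), the function name `split_occupancy_grid`, and the choice of voxel centers as initial points. The abstract does not specify ONet's output format or how the authors perform the conversion.

```python
from typing import Dict, FrozenSet, List, Tuple

Voxel = Tuple[int, int, int]          # integer voxel index (i, j, k)
Point = Tuple[float, float, float]    # metric coordinates (x, y, z)

# Hypothetical semantic label ids; a real occupancy network defines its own.
FREE = 0
ROAD = 1
BUILDING = 2
VEHICLE = 3
DYNAMIC_LABELS: FrozenSet[int] = frozenset({VEHICLE})


def split_occupancy_grid(
    grid: Dict[Voxel, int],
    voxel_size: float,
    origin: Point = (0.0, 0.0, 0.0),
    dynamic_labels: FrozenSet[int] = DYNAMIC_LABELS,
) -> Tuple[List[Point], List[Point]]:
    """Return (static_points, dynamic_points) as occupied voxel centers."""
    static_points: List[Point] = []
    dynamic_points: List[Point] = []
    # Sort voxel indices so the output order is deterministic.
    for (i, j, k), label in sorted(grid.items()):
        if label == FREE:
            continue  # unoccupied voxels contribute no points
        center = (
            origin[0] + (i + 0.5) * voxel_size,
            origin[1] + (j + 0.5) * voxel_size,
            origin[2] + (k + 0.5) * voxel_size,
        )
        if label in dynamic_labels:
            dynamic_points.append(center)
        else:
            static_points.append(center)
    return static_points, dynamic_points


if __name__ == "__main__":
    toy_grid = {(0, 0, 0): ROAD, (1, 0, 0): VEHICLE, (2, 0, 0): FREE}
    static_pts, dynamic_pts = split_occupancy_grid(toy_grid, voxel_size=0.5)
    print("static:", static_pts)
    print("dynamic:", dynamic_pts)
```

In a pipeline of this kind, each point set would then seed its own group of Gaussians, one for the street background and one for moving vehicles.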

Taxonomy

Topics: Traffic Prediction and Management Techniques · Autonomous Vehicle Technology and Safety · Data Management and Algorithms

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings