$\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving

Nan Huang; Xiaobao Wei; Wenzhao Zheng; Pengju An; Ming Lu; Wei Zhan; Masayoshi Tomizuka; Kurt Keutzer; Shanghang Zhang

arXiv:2405.20323·cs.CV·May 31, 2024

TL;DR

This paper introduces a self-supervised 3D Gaussian method for street scene reconstruction that effectively decomposes static and dynamic elements without requiring costly 3D annotations, improving efficiency for autonomous driving applications.

Contribution

The paper presents $\textit{S}^3$Gaussian, a novel self-supervised approach for decomposing street scenes into static and dynamic parts using 3D Gaussians and a spatial-temporal field network, eliminating the need for tracked 3D vehicle bounding boxes.

Findings

01 Achieves effective static-dynamic decomposition without 3D annotations.

02 Demonstrates superior performance on the Waymo Open Dataset.

03 Enables efficient 4D scene reconstruction for autonomous driving.

Abstract

Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving. Despite the efficacy of Neural Radiance Fields (NeRF) for driving scenes, 3D Gaussian Splatting (3DGS) emerges as a promising direction due to its faster speed and more explicit representation. However, most existing street 3DGS methods require tracked 3D vehicle bounding boxes to decompose the static and dynamic elements for effective reconstruction, limiting their applications for in-the-wild scenarios. To facilitate efficient 3D scene reconstruction without costly annotations, we propose a self-supervised street Gaussian ($S^{3}$Gaussian) method to decompose dynamic and static elements from 4D consistency. We represent each scene with 3D Gaussians to preserve the explicitness and further accompany them with a spatial-temporal field network to…
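To make the idea of separating static and dynamic elements without bounding boxes more concrete, here is a toy sketch in Python. It is not the paper's implementation: the function `toy_offset_field`, the rule that Gaussians with positive x move, and the displacement threshold are all hypothetical stand-ins chosen for illustration. A learned spatial-temporal field network would take the place of the hand-written offset function.

```python
import numpy as np


def toy_offset_field(means, t):
    """Hypothetical stand-in for a learned spatial-temporal field.

    Returns one 3D offset per Gaussian centre at time t.
    Assumption for this toy: Gaussians with x > 0 belong to a moving
    object that translates along y; all others never move.
    """
    moving = means[:, 0] > 0.0
    offsets = np.zeros_like(means)
    offsets[moving, 1] = 2.0 * t
    return offsets


def split_static_dynamic(means, times, threshold=0.1):
    """Label a Gaussian as dynamic if its offset ever exceeds threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    # Shape (num_times, num_gaussians): offset magnitude at each time step.
    disp = np.stack(
        [np.linalg.norm(toy_offset_field(means, t), axis=1) for t in times]
    )
    return disp.max(axis=0) > threshold


# Four Gaussian centres: two on the "static" side, two on the "moving" side.
means = np.array(
    [[-1.0, 0.0, 0.0], [-2.0, 1.0, 0.0], [1.0, 0.0, 0.0], [3.0, 2.0, 0.5]]
)
times = np.linspace(0.0, 1.0, 5)
dynamic_mask = split_static_dynamic(means, times)
print(dynamic_mask)
```

The point of the sketch is only that a time-conditioned field over explicit Gaussians yields a per-Gaussian motion signal, which can be used to separate the scene without any tracked box annotations.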

Code & Models

Repositories

nnanhuang/s3gaussian
PyTorch · Official

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Gaussian Processes and Bayesian Inference · Time Series Analysis and Forecasting

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings