EGSRAL: An Enhanced 3D Gaussian Splatting based Renderer with Automated Labeling for Large-Scale Driving Scene

Yixiong Huo; Guangfeng Jiang; Hongyang Wei; Ji Liu; Song Zhang; Han Liu; Xingliang Huang; Mingjie Lu; Jinzhang Peng; Dong Li; Lu Tian; Emad Barsoum

arXiv:2412.15550·cs.CV·December 23, 2024

TL;DR

EGSRAL is a 3D Gaussian Splatting (3D GS) based renderer for large-scale driving scenes that relies solely on training images and automatically generates labels for the images it synthesizes, improving scene modeling and downstream detection without extra annotations.

Contribution

EGSRAL models both dynamic objects and static backgrounds and adds an adaptor that labels synthesized images automatically, removing the need for additional annotation data.

Findings

01 Achieves a state-of-the-art PSNR of 29.04 on nuScenes.

02 Automatically generates annotations for synthesized images, improving detection performance.

03 Effectively models complex large-scale scenes.
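
The PSNR reported in the findings is a standard image-fidelity metric for novel view synthesis. As a rough illustration of what the number measures, here is a minimal sketch of the textbook definition, assuming pixel intensities normalized to [0, 1]. The function name `psnr` and its parameters are illustrative and are not taken from the paper or its repository, and the authors' evaluation protocol may differ.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of pixel intensities in
    [0, max_value]. This is the generic definition, not EGSRAL's own code.
    """
    if len(reference) != len(rendered):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0.0, 0.5, 1.0, 0.25]
    render = [0.1, 0.4, 0.9, 0.35]
    print(f"PSNR: {psnr(ground_truth, render):.2f} dB")
```

Under this definition, for a single image pair with intensities in [0, 1], a PSNR of 29.04 dB would correspond to a mean squared error of roughly 1.25e-3.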

Abstract

3D Gaussian Splatting (3D GS) has gained popularity due to its faster rendering speed and high-quality novel view synthesis. Some researchers have explored using 3D GS for reconstructing driving scenes. However, these methods often rely on various data types, such as depth maps, 3D boxes, and trajectories of moving objects. Additionally, the lack of annotations for synthesized images limits their direct application in downstream tasks. To address these issues, we propose EGSRAL, a 3D GS-based method that relies solely on training images without extra annotations. EGSRAL enhances 3D GS's capability to model both dynamic objects and static backgrounds and introduces a novel adaptor for auto labeling, generating corresponding annotations based on existing annotations. We also propose a grouping strategy for vanilla 3D GS to address perspective issues in rendering large-scale, complex…

Code & Models

Repositories

jiangxb98/egsral (PyTorch, official)

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Remote Sensing and LiDAR Applications · Autonomous Vehicle Technology and Safety

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings