Gaussian2Scene: 3D Scene Representation Learning via Self-supervised Learning with 3D Gaussian Splatting

Keyi Liu; Weidong Yang; Ben Fei; Ying He

arXiv:2506.08777 · cs.CV · June 12, 2025

TL;DR

Gaussian2Scene is a self-supervised learning framework for 3D scene understanding. It uses 3D Gaussian Splatting as an efficient, explicit scene representation with improved geometric reconstruction, and it reports better 3D object detection than existing pre-training methods.

Contribution

The paper presents a scene-level self-supervised learning (SSL) framework that uses 3D Gaussian Splatting for explicit scene modeling, reducing memory and computational cost relative to volume rendering and strengthening 3D geometric understanding.

Findings

01. Improves 3D object detection performance over existing pre-training methods.

02. Supports direct 3D scene reconstruction with Gaussian primitives.

03. Reduces memory and computational demands compared to volume rendering.
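
For readers unfamiliar with how explicit Gaussian primitives are turned into pixels, the sketch below shows the generic per-pixel blending rule used in 3D Gaussian Splatting: each projected Gaussian contributes an opacity that falls off with distance from its center, and contributions are composited front to back. This is a minimal illustration of the general technique, not the paper's pre-training pipeline; the function names `gaussian_alpha` and `composite_pixel` and all parameter names are hypothetical, and the Gaussians are assumed to be already projected to 2D and sorted by depth.

```python
import math
from typing import Sequence, Tuple

Color = Tuple[float, float, float]


def gaussian_alpha(opacity: float, dx: float, dy: float,
                   inv_cov: Tuple[float, float, float]) -> float:
    """Opacity of one projected 2D Gaussian at a pixel offset (dx, dy).

    inv_cov holds the symmetric 2x2 inverse covariance as (a, b, c),
    i.e. [[a, b], [b, c]]. Hypothetical helper for illustration only.
    """
    a, b, c = inv_cov
    # Mahalanobis-style exponent: 0.5 * d^T * inv_cov * d
    power = 0.5 * (a * dx * dx + 2.0 * b * dx * dy + c * dy * dy)
    return opacity * math.exp(-power)


def composite_pixel(colors: Sequence[Color],
                    alphas: Sequence[float]) -> Tuple[Color, float]:
    """Front-to-back alpha compositing of depth-sorted primitives.

    Returns the blended color and the remaining transmittance.
    Assumes colors[i] and alphas[i] are ordered nearest first.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for color, alpha in zip(colors, alphas):
        weight = alpha * transmittance
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return (pixel[0], pixel[1], pixel[2]), transmittance


# Two primitives covering the pixel center: red in front, green behind.
front = gaussian_alpha(0.5, 0.0, 0.0, (1.0, 0.0, 1.0))
back = gaussian_alpha(0.5, 0.0, 0.0, (1.0, 0.0, 1.0))
blended, remaining = composite_pixel(
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [front, back]
)
```

Because each pixel only touches the primitives that overlap it, the cost in this sketch scales with the number of overlapping Gaussians rather than with a fixed number of samples along every ray.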

Abstract

Self-supervised learning (SSL) for point cloud pre-training has become a cornerstone for many 3D vision tasks, enabling effective learning from large-scale unannotated data. At the scene level, existing SSL methods often incorporate volume rendering into the pre-training framework, using RGB-D images as reconstruction signals to facilitate cross-modal learning. This strategy promotes alignment between 2D and 3D modalities and enables the model to benefit from rich visual cues in the RGB-D inputs. However, these approaches are limited by their reliance on implicit scene representations and high memory demands. Furthermore, since their reconstruction objectives are applied only in 2D space, they often fail to capture underlying 3D geometric structures. To address these challenges, we propose Gaussian2Scene, a novel scene-level SSL framework that leverages the efficiency and explicit…

Taxonomy

Topics: 3D Shape Modeling and Analysis · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning