GaussianCross: Cross-modal Self-supervised 3D Representation Learning via Gaussian Splatting
Lei Yao, Yi Wang, Yi Zhang, Moyun Liu, Lap-Pui Chau

TL;DR
GaussianCross introduces a novel cross-modal self-supervised learning framework for 3D scene understanding, utilizing Gaussian splatting and adaptive distillation to improve robustness, efficiency, and generalization in 3D representations.
Contribution
The paper proposes GaussianCross, an architecture that converts 3D point clouds into a cuboid-normalized Gaussian representation and employs a tri-attribute adaptive distillation splatting module for self-supervised 3D representation learning.
Findings
Achieves superior performance with minimal parameters and data.
Demonstrates strong generalization on multiple benchmarks.
Improves segmentation accuracy significantly over existing methods.
Abstract
Informative and robust point representations are widely acknowledged as essential for 3D scene understanding. Although existing self-supervised pre-training methods show promising performance, model collapse and structural information deficiency remain prevalent because the point discrimination task is insufficiently difficult, yielding unreliable representations and suboptimal performance. In this paper, we present GaussianCross, a novel cross-modal self-supervised 3D representation learning architecture that integrates feed-forward 3D Gaussian Splatting (3DGS) techniques to address these challenges. GaussianCross converts scale-inconsistent 3D point clouds into a unified cuboid-normalized Gaussian representation without losing detail, enabling stable and generalizable pre-training. Subsequently, a tri-attribute adaptive distillation splatting module is…
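The abstract does not spell out how the cuboid normalization is computed, so the following is only an illustrative sketch of the general idea of removing scale inconsistency before building a Gaussian representation. It assumes a simple axis-aligned bounding-box normalization into the cube [-1, 1]^3 with one shared scale factor; the function name `cuboid_normalize` and the toy scenes are hypothetical and are not taken from the paper or its code.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def cuboid_normalize(points: List[Point]) -> Tuple[List[Point], Point, float]:
    """Map a point cloud into the cube [-1, 1]^3 (assumed normalization).

    Returns the normalized points plus the center and half extent
    needed to undo the mapping.
    """
    if not points:
        raise ValueError("empty point cloud")
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    # One shared scale for all three axes, so shapes are not distorted.
    half_extent = max(hi - lo for lo, hi in zip(mins, maxs)) / 2.0
    if half_extent == 0.0:
        half_extent = 1.0  # degenerate cloud: all points coincide
    normalized = [
        tuple((p[i] - center[i]) / half_extent for i in range(3))
        for p in points
    ]
    return normalized, center, half_extent


# Two toy "scenes" with the same shape but a 10x difference in scale.
room = [(0.0, 0.0, 0.0), (8.0, 4.0, 2.0)]
desk = [(0.0, 0.0, 0.0), (0.8, 0.4, 0.2)]
norm_room, room_center, room_half = cuboid_normalize(room)
norm_desk, desk_center, desk_half = cuboid_normalize(desk)
```

Under this assumption, both toy scenes map to the same normalized coordinates, which is the property a scale-consistent input representation needs. The returned center and half extent allow the original coordinates to be recovered.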
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Advanced Neural Network Applications
