GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs

Xingrui Wang; Cuiling Lan; Hanxin Zhu; Zhibo Chen; Yan Lu

arXiv:2412.16932 · cs.CV · December 24, 2024

TL;DR

GSemSplat is a framework for generalizable 3D semantic modeling from sparse, uncalibrated image pairs. It removes the need for per-scene optimization, dense image collections, and camera calibration, which makes semantic 3D modeling more practical to deploy.

Contribution

It proposes GSemSplat, a method that learns open-vocabulary semantic 3D representations linked to Gaussian primitives without per-scene optimization or calibration, leveraging dual-feature supervision.
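As a rough illustration of what "open-vocabulary semantic representations linked to Gaussian primitives" can mean in practice, the sketch below labels each Gaussian by comparing a per-Gaussian feature vector against text embeddings with cosine similarity. This is not the authors' implementation: the function names (`cosine`, `query_gaussians`), the 3-dimensional toy vectors, and the labels are all hypothetical, and a real system would presumably obtain the text embeddings from a vision-language model rather than write them by hand.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_gaussians(gaussian_feats, text_embeds, labels):
    """Assign each Gaussian the label whose text embedding is most similar.

    gaussian_feats: one semantic feature vector per Gaussian (hypothetical).
    text_embeds:    one embedding per candidate label (hypothetical).
    labels:         label strings, aligned with text_embeds.
    """
    assigned = []
    for feat in gaussian_feats:
        scores = [cosine(feat, emb) for emb in text_embeds]
        best = max(range(len(labels)), key=scores.__getitem__)
        assigned.append(labels[best])
    return assigned


# Toy, hand-written values purely for illustration.
gaussian_feats = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]
text_embeds = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
labels = ["chair", "table"]

assigned = query_gaussians(gaussian_feats, text_embeds, labels)
print(assigned)
```

Because the label set is just a list of text embeddings, new categories can be queried without retraining, which is the usual sense of "open-vocabulary" in this setting.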

Findings

1. Outperforms traditional scene-specific methods on ScanNet++
2. Effectively learns semantic features from sparse, uncalibrated images
3. Demonstrates generalization across diverse scenes

Abstract

Modeling and understanding the 3D world is crucial for various applications, from augmented reality to robotic navigation. Recent advancements based on 3D Gaussian Splatting have integrated semantic information from multi-view images into Gaussian primitives. However, these methods typically require costly per-scene optimization from dense calibrated images, limiting their practicality. In this paper, we consider the new task of generalizable 3D semantic field modeling from sparse, uncalibrated image pairs. Building upon the Splatt3R architecture, we introduce GSemSplat, a framework that learns open-vocabulary semantic representations linked to 3D Gaussians without the need for per-scene optimization, dense image collections or calibration. To ensure effective and reliable learning of semantic features in 3D space, we employ a dual-feature approach that leverages both region-specific…

Code & Models

Repositories

wxrui182/GSemSplat
jax · Official

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Advanced Neural Network Applications · Image Processing and 3D Reconstruction