GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs
Xingrui Wang, Cuiling Lan, Hanxin Zhu, Zhibo Chen, Yan Lu

TL;DR
GSemSplat is a framework for generalizable 3D semantic modeling from sparse, uncalibrated image pairs. It removes the need for per-scene optimization, dense image collections, and camera calibration, which makes semantic 3D modeling more practical to deploy.
Contribution
The paper proposes GSemSplat, a method that learns open-vocabulary semantic representations linked to 3D Gaussian primitives without per-scene optimization or calibration, using a dual-feature approach to supervise the semantic features.
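To make "open-vocabulary semantic representations linked to Gaussian primitives" concrete, here is a minimal sketch of how such features could be queried once they exist. It is an illustration, not the paper's implementation: the function name `label_gaussians`, the array shapes, and the use of cosine similarity against text embeddings are all assumptions, and the embeddings are toy values rather than outputs of any real model.

```python
import numpy as np

def label_gaussians(gaussian_feats, text_embeds, eps=1e-8):
    """Assign each Gaussian the index of its most similar text query.

    gaussian_feats: (N, D) array, one semantic feature per Gaussian (assumed layout).
    text_embeds:    (K, D) array, one embedding per text query (assumed layout).
    Returns (labels, sim): labels has shape (N,), sim has shape (N, K).
    """
    # L2-normalize both sets so the dot product equals cosine similarity.
    g = gaussian_feats / (np.linalg.norm(gaussian_feats, axis=1, keepdims=True) + eps)
    t = text_embeds / (np.linalg.norm(text_embeds, axis=1, keepdims=True) + eps)
    sim = g @ t.T
    # Each Gaussian takes the query with the highest similarity.
    return sim.argmax(axis=1), sim

# Toy example: three Gaussians with 2-D features, two hypothetical text queries.
feats = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.1]])
queries = np.array([[1.0, 0.0], [0.0, 1.0]])
labels, sim = label_gaussians(feats, queries)
```

In practice the per-Gaussian features would come from the trained network and the query embeddings from a vision-language text encoder. Both are outside the scope of this sketch.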
Findings
Outperforms traditional scene-specific methods on ScanNet++
Effectively learns semantic features from sparse, uncalibrated images
Demonstrates generalization across diverse scenes
Abstract
Modeling and understanding the 3D world is crucial for various applications, from augmented reality to robotic navigation. Recent advancements based on 3D Gaussian Splatting have integrated semantic information from multi-view images into Gaussian primitives. However, these methods typically require costly per-scene optimization from dense calibrated images, limiting their practicality. In this paper, we consider the new task of generalizable 3D semantic field modeling from sparse, uncalibrated image pairs. Building upon the Splatt3R architecture, we introduce GSemSplat, a framework that learns open-vocabulary semantic representations linked to 3D Gaussians without the need for per-scene optimization, dense image collections or calibration. To ensure effective and reliable learning of semantic features in 3D space, we employ a dual-feature approach that leverages both region-specific…
Taxonomy
Topics: 3D Surveying and Cultural Heritage · Advanced Neural Network Applications · Image Processing and 3D Reconstruction
