SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images

Yu Sheng; Jiajun Deng; Xinran Zhang; Yu Zhang; Bei Hua; Yanyong Zhang; Jianmin Ji

arXiv:2505.23044·cs.CV·October 13, 2025

Open Access

TL;DR

SpatialSplat is a feedforward semantic 3D reconstruction method that cuts scene parameters by 60% while maintaining high accuracy, using a dual-field semantic representation and a selective Gaussian mechanism.

Contribution

The paper proposes a dual-field semantic representation and a selective Gaussian mechanism that improve the efficiency and detail of semantic 3D reconstruction over prior methods.

Findings

01 Achieves a 60% reduction in scene parameters.

02 Outperforms state-of-the-art methods in accuracy.

03 Effectively captures fine-grained semantics with fewer primitives.

Abstract

A major breakthrough in 3D reconstruction is the feedforward paradigm to generate pixel-wise 3D points or Gaussian primitives from sparse, unposed images. To further incorporate semantics while avoiding the significant memory and storage costs of high-dimensional semantic features, existing methods extend this paradigm by associating each primitive with a compressed semantic feature vector. However, these methods have two major limitations: (a) the naively compressed feature compromises expressiveness, affecting the model's ability to capture fine-grained semantics, and (b) the pixel-wise primitive prediction introduces redundancy in overlapping areas, causing unnecessary memory overhead. To this end, we introduce SpatialSplat, a feedforward framework that produces redundancy-aware Gaussians and capitalizes on a dual-field semantic representation. Particularly, with the insight…
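To make the redundancy argument in the abstract concrete, the following is a toy parameter count, not the paper's method. It assumes one Gaussian per pixel per view, a hypothetical 14 geometry and appearance values per Gaussian (3 position, 4 rotation, 3 scale, 1 opacity, 3 color), a hypothetical 16-dimensional semantic feature, and a hypothetical `keep_ratio` standing in for whatever fraction of primitives survives a redundancy filter. The value 0.4 is chosen only so the toy numbers line up with the 60% figure quoted above; the paper's actual reduction may come from a different mix of mechanisms.

```python
# Toy parameter accounting for pixel-wise Gaussian prediction.
# All sizes are hypothetical illustration values, not figures from the paper.

def scene_parameters(num_views, height, width, geom_dim, feat_dim, keep_ratio=1.0):
    """Count scene parameters when each pixel of each view predicts one Gaussian.

    geom_dim   -- per-Gaussian geometry/appearance values (assumed)
    feat_dim   -- per-Gaussian semantic feature length (assumed)
    keep_ratio -- fraction of Gaussians kept after a redundancy filter (assumed)
    """
    num_gaussians = int(num_views * height * width * keep_ratio)
    return num_gaussians * (geom_dim + feat_dim)


def reduction(baseline, compact):
    """Fractional parameter saving of `compact` relative to `baseline`."""
    return 1.0 - compact / baseline


# Two 256x256 views: every pixel emits a Gaussian, overlap included.
baseline = scene_parameters(2, 256, 256, geom_dim=14, feat_dim=16)
# Same setup, but only 40% of the Gaussians are kept (hypothetical filter).
compact = scene_parameters(2, 256, 256, geom_dim=14, feat_dim=16, keep_ratio=0.4)

print(f"baseline={baseline} compact={compact} "
      f"reduction={reduction(baseline, compact):.0%}")
```

The count scales linearly with both the number of primitives and the per-primitive feature length, which is why the abstract treats pixel-wise redundancy and per-primitive semantic features as separate sources of memory overhead.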

Taxonomy

Topics: 3D Shape Modeling and Analysis · Advanced Vision and Imaging · Robotics and Sensor-Based Localization