SLGaussian: Fast Language Gaussian Splatting in Sparse Views

Kangjie Chen; BingQuan Dai; Minghan Qin; Dongbin Zhang; Peihao Li; Yingshuang Zou; Haoqian Wang

arXiv:2412.08331 · cs.CV · August 19, 2025

TL;DR

SLGaussian is a fast feed-forward method that builds language-embedded 3D semantic fields from sparse viewpoints, enabling efficient and accurate 3D scene understanding and outperforming existing approaches.

Contribution

The paper introduces SLGaussian, a feed-forward approach that efficiently embeds language into 3D space from sparse views, avoiding costly per-scene optimization.

Findings

01 Outperforms existing methods on IoU, localization accuracy, and mIoU in two-view sparse 3D object querying and segmentation.

02 Scene inference takes under 30 seconds.

03 Open-vocabulary querying takes only 0.011 seconds per query.

Abstract

3D semantic field learning is crucial for applications like autonomous navigation, AR/VR, and robotics, where accurate comprehension of 3D scenes from limited viewpoints is essential. Existing methods struggle under sparse view conditions, relying on inefficient per-scene multi-view optimizations, which are impractical for many real-world tasks. To address this, we propose SLGaussian, a feed-forward method for constructing 3D semantic fields from sparse viewpoints, allowing direct inference of 3DGS-based scenes. By ensuring consistent SAM segmentations through video tracking and using low-dimensional indexing for high-dimensional CLIP features, SLGaussian efficiently embeds language information in 3D space, offering a robust solution for accurate 3D scene understanding under sparse view conditions. In experiments on two-view sparse 3D object querying and segmentation in the LERF and…
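The idea of "low-dimensional indexing for high-dimensional CLIP features" mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook layout, the integer index map, the `query_mask` helper, the threshold, and the random vectors standing in for CLIP embeddings are all assumptions made for illustration. The point is only that a query is compared against a small table of per-object features, and the per-pixel result is then a cheap lookup.

```python
import numpy as np

def normalize(x, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical codebook: one high-dimensional feature per segmented object.
# Random unit vectors stand in for real CLIP image embeddings.
num_objects, feat_dim = 4, 512
codebook = normalize(rng.standard_normal((num_objects, feat_dim)))

# Hypothetical rendered index map: each pixel stores a small integer id
# instead of a full 512-dimensional feature vector.
index_map = rng.integers(0, num_objects, size=(6, 8))

def query_mask(text_feature, codebook, index_map, threshold=0.5):
    """Score each codebook entry against the query, then look up per pixel."""
    sims = codebook @ normalize(text_feature)  # shape: (num_objects,)
    return sims[index_map] >= threshold, sims  # mask shape: index_map.shape

# Stand-in text embedding: a slightly perturbed copy of object 2's feature.
text_feature = codebook[2] + 0.01 * rng.standard_normal(feat_dim)
mask, sims = query_mask(text_feature, codebook, index_map)
print("best-matching object id:", int(np.argmax(sims)))
print("selected pixels:", int(mask.sum()))
```

Under these assumptions the similarity computation scales with the number of objects rather than the number of pixels or Gaussians, which is one plausible reason a per-query cost can stay small.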

Taxonomy

Topics: Text and Document Classification Technologies · Speech Recognition and Synthesis · Machine Learning and Data Classification

Methods: Segment Anything Model · Contrastive Language-Image Pre-training