A Study of the Framework and Real-World Applications of Language Embedding for 3D Scene Understanding
Mahmoud Chick Zaouali, Todd Charter, Yehor Karpichev, Brandon Haworth, Homayoun Najjaran

TL;DR
This paper reviews the integration of language embeddings with Gaussian Splatting for 3D scene understanding, highlighting recent advances, challenges, and future research directions in this emerging field.
Contribution
The paper provides a comprehensive overview of how language guidance is incorporated into Gaussian Splatting for 3D scenes, covering both theoretical foundations and real-world applications.
Findings
Integration of language embeddings enhances semantic understanding of 3D scenes.
Current methods face computational and generalization challenges.
There is a scarcity of semantically annotated 3D Gaussian data.
Abstract
Gaussian Splatting has rapidly emerged as a transformative technique for real-time 3D scene representation, offering a highly efficient and expressive alternative to Neural Radiance Fields (NeRF). Its ability to render complex scenes with high fidelity has enabled progress across domains such as scene reconstruction, robotics, and interactive content creation. More recently, the integration of Large Language Models (LLMs) and language embeddings into Gaussian Splatting pipelines has opened new possibilities for text-conditioned generation, editing, and semantic scene understanding. Despite these advances, a comprehensive overview of this emerging intersection has been lacking. This survey presents a structured review of current research efforts that combine language guidance with 3D Gaussian Splatting, detailing theoretical foundations, integration strategies, and real-world use cases.…
