TGSFormer: Scalable Temporal Gaussian Splatting for Embodied Semantic Scene Completion

Rui Qian; Haozhi Cao; Tianchen Deng; Tianxin Hu; Weixiang Guo; Shenghai Yuan; Lihua Xie

arXiv:2512.00300 · cs.CV · December 2, 2025

TL;DR

TGSFormer introduces a scalable, memory-efficient framework for embodied semantic scene completion that leverages persistent Gaussian memory and confidence-aware fusion, achieving state-of-the-art accuracy and scalability.

Contribution

It proposes a novel Temporal Gaussian Splatting method with persistent memory and confidence-aware fusion for improved scalability and accuracy in embodied SSC.

Findings

01 Achieves state-of-the-art results on SSC benchmarks.

02 Uses fewer primitives while maintaining scene integrity.

03 Demonstrates superior scalability and accuracy.

Abstract

Embodied 3D Semantic Scene Completion (SSC) infers dense geometry and semantics from continuous egocentric observations. Most existing Gaussian-based methods rely on random initialization of many primitives within predefined spatial bounds, resulting in redundancy and poor scalability to unbounded scenes. A recent depth-guided approach alleviates this issue but remains local and suffers from latency and memory overhead as scale increases. To overcome these challenges, we propose TGSFormer, a scalable Temporal Gaussian Splatting framework for embodied SSC. It maintains a persistent Gaussian memory for temporal prediction, without relying on image coherence or frame caches. For temporal fusion, a Dual Temporal Encoder jointly processes current and historical Gaussian features through confidence-aware cross-attention. Subsequently, a Confidence-aware Voxel Fusion module merges overlapping…
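The text above does not specify how the Confidence-aware Voxel Fusion module combines primitives. As a rough illustration of the general idea only, the sketch below merges primitives that fall into the same voxel by confidence-weighted averaging. Everything in it is an assumption rather than the authors' implementation: the function name `fuse_by_voxel`, the `voxel_size` parameter, the weighted-mean rule, and the choice to keep the maximum confidence per voxel are all hypothetical.

```python
import numpy as np


def fuse_by_voxel(centers, features, confidence, voxel_size=0.2):
    """Hypothetical sketch: merge primitives that share a voxel.

    centers:    (N, 3) primitive positions
    features:   (N, C) per-primitive feature vectors
    confidence: (N,)   non-negative per-primitive confidence scores
    Returns fused centers (M, 3), features (M, C), confidence (M,).
    """
    # Integer voxel coordinates for every primitive.
    voxel_idx = np.floor(centers / voxel_size).astype(np.int64)

    # Group primitives by voxel; `inverse` maps each primitive to its group.
    _, inverse = np.unique(voxel_idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_groups = int(inverse.max()) + 1

    # Sum of confidences per voxel (guarded against division by zero).
    weight_sum = np.zeros(n_groups)
    np.add.at(weight_sum, inverse, confidence)
    weight_sum = np.maximum(weight_sum, 1e-8)

    # Confidence-weighted mean of positions and features (assumed rule).
    fused_centers = np.zeros((n_groups, centers.shape[1]))
    np.add.at(fused_centers, inverse, centers * confidence[:, None])
    fused_centers /= weight_sum[:, None]

    fused_features = np.zeros((n_groups, features.shape[1]))
    np.add.at(fused_features, inverse, features * confidence[:, None])
    fused_features /= weight_sum[:, None]

    # Keep the highest confidence seen in each voxel (assumed rule).
    fused_conf = np.zeros(n_groups)
    np.maximum.at(fused_conf, inverse, confidence)

    return fused_centers, fused_features, fused_conf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = rng.uniform(0.0, 1.0, size=(1000, 3))
    features = rng.normal(size=(1000, 8))
    confidence = rng.uniform(0.1, 1.0, size=1000)
    fused = fuse_by_voxel(centers, features, confidence, voxel_size=0.25)
    print(len(centers), "primitives fused into", len(fused[0]))
```

A weighted mean lets high-confidence primitives dominate the merged result while still capping the primitive count at one per occupied voxel. Whether TGSFormer uses this rule, a learned one, or something else is not stated in the text available here.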

Taxonomy

Topics: 3D Shape Modeling and Analysis · Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis