TTSA3R: Training-Free Temporal-Spatial Adaptive Persistent State for Streaming 3D Reconstruction
Zhijie Zheng, Xinhao Xiang, Jiawei Zhang

TL;DR
TTSA3R is a training-free framework that improves long-term 3D reconstruction by adaptively updating persistent states using temporal and spatial cues, reducing catastrophic forgetting over extended sequences.
Contribution
TTSA3R introduces a training-free approach built on two modules, a Temporal Adaptive Update Module and a Spatial Contextual Update Module, to improve long-term stability in streaming 3D reconstruction.
Findings
Significantly reduces error increase on extended sequences
Outperforms baseline models in long-term reconstruction stability
Effective across diverse 3D reconstruction tasks
Abstract
Streaming recurrent models enable efficient 3D reconstruction by maintaining persistent state representations. However, they suffer from catastrophic forgetting over long sequences because they must balance historical information against new observations. Recent methods alleviate this by deriving adaptive signals from an attention perspective, but they operate on single dimensions without considering temporal and spatial consistency. To this end, we propose a training-free framework termed TTSA3R that leverages both temporal state evolution and spatial observation quality for adaptive state updates in 3D reconstruction. In particular, we devise a Temporal Adaptive Update Module that regulates update magnitude by analyzing temporal state evolution patterns. Then, a Spatial Contextual Update Module is introduced to localize spatial regions that require updates through observation-state alignment and…
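To make the idea of a gated persistent-state update concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the formulas, so the gate definitions (`temporal_gate`, `spatial_gate`), the use of cosine similarity, and the way the two gates are multiplied are all assumptions chosen only for illustration. The sketch shows one general pattern the abstract describes: a per-token update whose magnitude depends on how the state is evolving and on how well the state aligns with the current observation.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)

def temporal_gate(prev_token, cand_token):
    # ASSUMPTION: the gate grows with the change between the previous state
    # token and its candidate replacement; maps cosine in [-1, 1] to [0, 1].
    return 0.5 * (1.0 - cosine(prev_token, cand_token))

def spatial_gate(state_token, obs_tokens):
    # ASSUMPTION: a state token is considered "observed" if it aligns with
    # at least one observation token; the best cosine match is clipped to [0, 1].
    best = max(cosine(state_token, o) for o in obs_tokens)
    return max(0.0, min(1.0, best))

def adaptive_update(state, candidate, obs_tokens):
    """Blend each state token toward its candidate by a per-token gate.

    state, candidate: lists of tokens (each token is a list of floats).
    obs_tokens: list of observation tokens for the current frame.
    """
    new_state = []
    for s, c in zip(state, candidate):
        # Hypothetical combination rule: product of the two gates.
        g = temporal_gate(s, c) * spatial_gate(s, obs_tokens)
        new_state.append([si + g * (ci - si) for si, ci in zip(s, c)])
    return new_state

if __name__ == "__main__":
    state = [[1.0, 0.0]]
    candidate = [[0.0, 1.0]]
    # Observation aligned with the state token: the token is partly updated.
    print(adaptive_update(state, candidate, [[1.0, 0.0]]))
    # Observation orthogonal to the state token: the token is left unchanged.
    print(adaptive_update(state, candidate, [[0.0, 1.0]]))
```

Because both gates lie in [0, 1], every updated token stays on the segment between its old value and its candidate, so a gate of 0 preserves history and a gate of 1 fully overwrites it. No training is involved in this pattern, which matches the training-free framing, but the specific signals used in TTSA3R should be taken from the paper itself.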
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
