Rethinking the State Update Gate for Long-Sequence Recurrent 3D Reconstruction

Kejun Ren, Lei Jin, Tianxin Huang, Lianming Xu, and Li Wang

arXiv:2605.16981 · cs.CV · May 19, 2026

TL;DR

This paper introduces a novel frame-level gating mechanism for long-sequence 3D reconstruction that improves accuracy and maintains constant memory without additional training or parameters.

Contribution

It proposes a parameter-free, closed-form, frame-level gate derived from frame-to-frame feature changes, targeting the structural bottleneck that bounds the state update in recurrent 3D reconstruction.

Findings

01. Reduces long-sequence drift by 51% on TUM-RGBD

02. Decreases depth estimation error by 12.8% on Bonn video depth

03. Outperforms existing methods on KITTI long-sequence pose estimation

Abstract

Streaming 3D reconstruction under a strict constant-memory budget hinges on how the recurrent state is updated as the stream evolves. We profile TTT3R-style per-token gates across five benchmarks and discover a structural bottleneck: the gate is intrinsically bounded in magnitude (median $0.31$; never exceeding $0.6$) and nearly frame-invariant, yielding an effective memory horizon of only $\sim 3$ frames per state token, which serves as the structural origin of long-sequence drift. We trace this to a missing axis: existing inference-time methods modulate updates only at the per-token, intra-frame level, while the orthogonal frame-level question of how strongly each frame should contribute to the state has been treated as content-independent. We close this gap with a scalar frame-level gate $\alpha_t \in (0, 1]$ derived in closed form from frame-to-frame changes of internal…
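
The roughly three-frame horizon quoted in the abstract is consistent with a simple back-of-the-envelope model. The sketch below assumes a convex-blend update, state_t = (1 - g) * state_{t-1} + g * new_t, with a constant gate g. This update form is an assumption made for illustration, not the paper's stated rule. Under it, a frame's contribution decays as (1 - g)^k after k further updates, and the effective window is the sum of those weights, 1 / g. With the reported median gate of 0.31, that gives about 3.2 frames. The function names are hypothetical, and the multiplicative use of a frame-level factor `alpha` is likewise assumed, since the abstract is cut off before it says how the frame-level gate enters the update.

```python
def retained_weight(gate: float, frames: int) -> float:
    """Relative weight of a past frame after `frames` later updates.

    Assumes the convex-blend update s_t = (1 - gate) * s_{t-1} + gate * x_t
    (an illustrative assumption, not taken from the paper).
    """
    return (1.0 - gate) ** frames


def memory_horizon(gate: float) -> float:
    """Effective window length in frames: sum over k >= 0 of (1 - gate)^k = 1 / gate."""
    if not 0.0 < gate <= 1.0:
        raise ValueError("gate must lie in (0, 1]")
    return 1.0 / gate


def effective_gate(gate: float, alpha: float = 1.0) -> float:
    """Per-token gate scaled by a frame-level factor alpha in (0, 1].

    Multiplicative scaling is a hypothetical choice for this sketch.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return alpha * gate


MEDIAN_GATE = 0.31  # median per-token gate reported in the abstract

print("horizon at median gate:", memory_horizon(MEDIAN_GATE))
print("weight left after 3 frames:", retained_weight(MEDIAN_GATE, 3))
print("horizon with alpha = 0.5:", memory_horizon(effective_gate(MEDIAN_GATE, 0.5)))
```

Under these assumptions, any factor `alpha` below 1 lowers the effective gate and lengthens the horizon, which illustrates why a frame-level control could counteract a per-token gate that is bounded and nearly constant across frames.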
