OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer

Si-Yu Lu; Po-Ting Chen; Hui-Che Hsu; Sin-Ye Jhong; Wen-Huang Cheng; Yung-Yao Chen

arXiv:2603.05959 · cs.CV · April 30, 2026

TL;DR

OVGGT introduces a resource-efficient, training-free framework for 3D reconstruction from streaming video, maintaining fixed memory and compute costs regardless of sequence length while achieving high accuracy.

Contribution

OVGGT combines Self-Selective Caching and Dynamic Anchor Protection to enable long-horizon streaming inference at constant memory and compute cost, surpassing prior methods.

Findings

01. Processes arbitrarily long videos within fixed VRAM.

02. Achieves state-of-the-art 3D geometric accuracy.

03. Supports indoor, outdoor, and ultra-long sequences.

Abstract

Reconstructing 3D geometry from streaming video requires continuous inference under bounded resources. Recent geometric foundation models achieve impressive reconstruction quality through all-to-all attention, yet their quadratic cost confines them to short, offline sequences. Causal-attention variants such as StreamVGGT enable single-pass streaming but accumulate an ever-growing KV cache, exhausting GPU memory within hundreds of frames and precluding the long-horizon deployment that motivates streaming inference in the first place. We present OVGGT, a training-free framework that bounds both memory and compute to a fixed budget regardless of sequence length. Our approach combines Self-Selective Caching, which leverages FFN residual magnitudes to compress the KV cache while remaining fully compatible with FlashAttention, with Dynamic Anchor Protection, which shields coordinate-critical…
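
To make the fixed-budget idea concrete, here is a minimal sketch of a bounded KV cache with score-based eviction and protected anchor entries. It is an illustration under stated assumptions, not the authors' implementation: the class name `BoundedKVCache`, the `budget` parameter, and the `is_anchor` flag are hypothetical, and the per-token `score` is a caller-supplied stand-in for the FFN residual magnitude mentioned in the abstract. How OVGGT actually computes scores and chooses anchors is not specified in the text above.

```python
from dataclasses import dataclass, field


@dataclass
class BoundedKVCache:
    """Toy KV cache that never holds more than `budget` entries.

    All names here are hypothetical. `score` stands in for a per-token
    importance signal (the abstract mentions FFN residual magnitudes).
    """
    budget: int
    keys: list = field(default_factory=list)
    values: list = field(default_factory=list)
    scores: list = field(default_factory=list)
    protected: list = field(default_factory=list)  # True marks an anchor entry

    def append(self, key, value, score, is_anchor=False):
        """Add one entry, then evict until the cache fits the budget."""
        self.keys.append(key)
        self.values.append(value)
        self.scores.append(score)
        self.protected.append(is_anchor)
        self._evict()

    def _evict(self):
        while len(self.keys) > self.budget:
            # Prefer evicting unprotected entries; anchors are kept.
            candidates = [i for i, p in enumerate(self.protected) if not p]
            if not candidates:
                # Assumption: if everything is an anchor, the budget still wins.
                candidates = list(range(len(self.keys)))
            # Drop the lowest-scoring candidate (oldest one on ties).
            victim = min(candidates, key=lambda i: self.scores[i])
            for buf in (self.keys, self.values, self.scores, self.protected):
                del buf[victim]


# Usage: stream 10 "frames" through a cache with room for 4 entries.
cache = BoundedKVCache(budget=4)
for t in range(10):
    # Frame 0 is marked as an anchor even though its score is the lowest.
    cache.append(key=f"k{t}", value=f"v{t}", score=float(t), is_anchor=(t == 0))
print(cache.keys)  # ['k0', 'k7', 'k8', 'k9']
```

Because the cache size is capped, the attention cost per new frame depends on `budget` rather than on the number of frames seen so far, which is the sense in which the cost is constant. The linear scan used for eviction here is for readability only; a real system would operate on batched tensors.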

Code & Models

Repositories

VAISR/OVGGT (GitHub)
