LONG3R: Long Sequence Streaming 3D Reconstruction
Zhuoguang Chen, Minghui Qin, Tianyuan Yuan, Zhe Liu, Hang Zhao

TL;DR
LONG3R introduces a real-time streaming 3D reconstruction model capable of handling long sequences by using a recurrent architecture with dynamic memory management, significantly improving performance over existing methods.
Contribution
The paper presents a novel recurrent model with a 3D spatio-temporal memory and a curriculum training strategy for efficient long-sequence 3D scene reconstruction.
Findings
Outperforms state-of-the-art streaming methods on long sequences
Maintains real-time inference speed
Effectively captures long-term scene information
Abstract
Recent advancements in multi-view scene reconstruction have been significant, yet existing methods face limitations when processing streams of input images. These methods either rely on time-consuming offline optimization or are restricted to shorter sequences, hindering their applicability in real-time scenarios. In this work, we propose LONG3R (LOng sequence streaming 3D Reconstruction), a novel model designed for streaming multi-view 3D scene reconstruction over longer sequences. Our model achieves real-time processing by operating recurrently, maintaining and updating memory with each new observation. We first employ a memory gating mechanism to filter relevant memory, which, together with a new observation, is fed into a dual-source refined decoder for coarse-to-fine interaction. To effectively capture long-sequence memory, we propose a 3D spatio-temporal memory that dynamically…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsComputer Graphics and Visualization Techniques · Medical Imaging Techniques and Applications
