LongDPM: Overlap-Aware 4D Reconstruction from Long Monocular Videos
Chenyi Xu, Yihao Wu, Liqi Yan, Chao Yang, Jianhui Zhang, Fangli Guan, Pan Li

TL;DR
LongDPM is an overlap-aware framework for scalable, long-range 4D scene reconstruction from monocular videos. It connects chunk-local predictions into a single coherent dynamic 3D model.
Contribution
The paper proposes a method that processes long videos in overlapping chunks and connects the chunk-local reconstructions into a unified long-range dynamic scene.
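The overlapping-chunk idea can be sketched as a simple sliding window over frame indices. This is a minimal illustration, not the authors' implementation: the function name `overlapping_chunks` and the parameters `chunk_len` and `overlap` are hypothetical, and the summary does not state which chunk length or overlap LongDPM uses.

```python
def overlapping_chunks(num_frames, chunk_len, overlap):
    """Split frame indices [0, num_frames) into overlapping windows.

    Returns a list of (start, end) pairs with `end` exclusive.
    Consecutive windows share `overlap` frames; the last window
    may be shorter than `chunk_len`.
    All names here are illustrative assumptions.
    """
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")
    if num_frames <= 0:
        return []
    stride = chunk_len - overlap  # new frames contributed by each chunk
    chunks = []
    start = 0
    while True:
        end = min(start + chunk_len, num_frames)
        chunks.append((start, end))
        if end == num_frames:
            break
        start += stride
    return chunks
```

Because each window holds at most `chunk_len` frames, the per-chunk working set does not grow with the video length, and the shared frames give neighbouring chunks common observations to register against.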
Findings
Reduces dense tracking end-point error (EPE) relative to V-DPM on the PointOdyssey, Kubric-F, and Kubric-G datasets.
Achieves the best absolute trajectory error (ATE) for camera pose estimation on TUM-dynamics.
Demonstrates superior long-range reconstruction and tracking performance.
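For readers unfamiliar with the tracking metric, end-point error is commonly computed as the mean Euclidean distance between predicted and ground-truth points. The sketch below uses that common definition; the function name `end_point_error` and the optional `valid` mask are assumptions, and the paper's exact evaluation protocol is not described in this summary.

```python
import numpy as np

def end_point_error(pred, gt, valid=None):
    """Mean Euclidean distance between predicted and ground-truth points.

    pred, gt: arrays of shape (..., D) holding point positions.
    valid:    optional boolean mask of shape (...) selecting points to score.
    Illustrative helper, not the paper's evaluation code.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    err = np.linalg.norm(pred - gt, axis=-1)  # per-point distance
    if valid is not None:
        err = err[np.asarray(valid, dtype=bool)]
    return float(err.mean())
```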
Abstract
Recovering a dynamic 3D scene from a long monocular video requires dense geometry, camera motion, and temporal correspondence to remain consistent in a shared coordinate system. Existing methods face two key challenges: (1) feed-forward reconstruction models provide accurate local predictions but are limited to short clips, and (2) long-range trackers preserve correspondences without producing dense sequence-level reconstruction. This paper presents LongDPM, a novel overlap-aware framework for scalable long-range monocular dynamic reconstruction. First, LongDPM processes long videos in overlapping chunks, keeping inference memory bounded by the chunk length. Second, it connects chunk-local coordinate systems through confidence-weighted registration with static-aware overlap abstraction. Third, it associates dynamic identities across chunk boundaries and fuses matched trajectories…
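One standard way to connect two chunk-local coordinate systems from points observed in their shared frames is a weighted Kabsch alignment, sketched below. This is an assumption-laden illustration rather than LongDPM's method: the abstract does not say whether the registration is rigid or also estimates scale, nor how confidences or static masks are derived. Here `weights` stands in for per-point confidence, and setting a weight to zero mimics excluding a dynamic point.

```python
import numpy as np

def weighted_rigid_align(src, dst, weights):
    """Weighted Kabsch: find R, t minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2.

    src, dst: (N, 3) corresponding points in the two chunk frames.
    weights:  (N,) non-negative confidences (hypothetical input).
    Rigid-only sketch; scale is NOT estimated.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Weighted centroids of both point sets.
    mu_src = (w[:, None] * src).sum(axis=0)
    mu_dst = (w[:, None] * dst).sum(axis=0)

    # Weighted cross-covariance of the centered points.
    H = (w[:, None] * (src - mu_src)).T @ (dst - mu_dst)

    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_dst - R @ mu_src
    return R, t
```

Applying `R @ p + t` to every point `p` of the source chunk then expresses it in the destination chunk's frame. A monocular pipeline may also need a scale factor, which would require a similarity rather than a rigid alignment.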
