4D Primitive-Mâché: Glueing Primitives for Persistent 4D Scene Reconstruction
Kirill Mazur, Marwan Taher, Andrew J. Davison

TL;DR
This paper introduces a system that reconstructs dynamic 4D scenes from monocular videos by decomposing scenes into moving primitives, enabling persistent, replayable 3D reconstructions over time.
Contribution
It proposes a novel method for joint inference of primitive motions and scene reconstruction, including extrapolation for invisible objects, advancing 4D scene understanding from monocular videos.
Findings
Outperforms existing methods on object scanning datasets
Provides continuous 4D reconstructions with object permanence
Enables replayable 3D scene visualization over time
Abstract
We present a dynamic reconstruction system that receives a casual monocular RGB video as input, and outputs a complete and persistent reconstruction of the scene. In other words, we reconstruct not only the currently visible parts of the scene, but also all previously viewed parts, which enables replaying the complete reconstruction across all timesteps. Our method decomposes the scene into a set of rigid 3D primitives, which are assumed to be moving throughout the scene. Using estimated dense 2D correspondences, we jointly infer the rigid motion of these primitives through an optimisation pipeline, yielding a 4D reconstruction of the scene, i.e. providing 3D geometry dynamically moving through time. To achieve this, we also introduce a mechanism to extrapolate motion for objects that become invisible, employing motion-grouping techniques to maintain continuity. The resulting…
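The abstract does not spell out how motion is extrapolated for primitives that become invisible, so the following is only a minimal sketch of one plausible reading: a hidden primitive borrows the rigid transform of a visible primitive in the same motion group. The function names (`rot_z`, `apply_rigid`, `extrapolate_hidden`), the dictionary layout, and the "hold still when nothing in the group is visible" fallback are all assumptions made for illustration, not the authors' implementation.

```python
import math


def rot_z(theta):
    """3x3 rotation about the z axis (hypothetical helper for the demo)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_rigid(R, t, p):
    """Apply the rigid transform x -> R x + t to a single 3D point."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def extrapolate_hidden(points_by_primitive, group_of, visible_motion):
    """Move every primitive by one timestep.

    points_by_primitive: primitive id -> list of 3D points (world frame)
    group_of:            primitive id -> motion-group id (assumed given)
    visible_motion:      primitive id -> (R, t), only for visible primitives
    """
    # Assumption: all members of a motion group share one world-frame
    # rigid transform, so any visible member can stand in for the group.
    group_motion = {}
    for pid, motion in visible_motion.items():
        group_motion.setdefault(group_of[pid], motion)

    moved = {}
    for pid, pts in points_by_primitive.items():
        motion = visible_motion.get(pid)
        if motion is None:
            motion = group_motion.get(group_of[pid])
        if motion is None:
            # Nothing in the group is visible: keep the primitive in place.
            moved[pid] = list(pts)
        else:
            R, t = motion
            moved[pid] = [apply_rigid(R, t, p) for p in pts]
    return moved


# Usage: "lid" is visible, "box" is occluded but in the same group.
points = {
    "lid": [(1.0, 0.0, 0.0)],
    "box": [(0.0, 1.0, 0.0)],
    "wall": [(5.0, 5.0, 5.0)],
}
group_of = {"lid": "g0", "box": "g0", "wall": "g1"}
visible_motion = {"lid": (rot_z(math.pi / 2), (0.0, 0.0, 1.0))}
moved = extrapolate_hidden(points, group_of, visible_motion)
```

In this toy setup the occluded "box" follows the "lid" transform, while "wall" stays where it was last reconstructed, which is the object-permanence behaviour the summary describes.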
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization
