4D Primitive-Mâché: Glueing Primitives for Persistent 4D Scene Reconstruction

Kirill Mazur; Marwan Taher; Andrew J. Davison

arXiv:2512.16564 · cs.CV · December 19, 2025

Open Access

TL;DR

This paper introduces a system that reconstructs dynamic 4D scenes from casual monocular RGB video by decomposing the scene into rigid, moving 3D primitives, yielding a persistent reconstruction that can be replayed across all timesteps.

Contribution

It proposes a method that jointly infers the rigid motions of the primitives and the scene reconstruction from estimated dense 2D correspondences, and that extrapolates motion for objects that become invisible by using motion grouping.

Findings

01. Outperforms existing methods on object scanning datasets

02. Provides continuous 4D reconstructions with object permanence

03. Enables replayable 3D scene visualization over time

Abstract

We present a dynamic reconstruction system that receives a casual monocular RGB video as input, and outputs a complete and persistent reconstruction of the scene. In other words, we reconstruct not only the currently visible parts of the scene, but also all previously viewed parts, which enables replaying the complete reconstruction across all timesteps. Our method decomposes the scene into a set of rigid 3D primitives, which are assumed to be moving throughout the scene. Using estimated dense 2D correspondences, we jointly infer the rigid motion of these primitives through an optimisation pipeline, yielding a 4D reconstruction of the scene, i.e. providing 3D geometry dynamically moving through time. To achieve this, we also introduce a mechanism to extrapolate motion for objects that become invisible, employing motion-grouping techniques to maintain continuity. The resulting…
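The abstract does not spell out how motion is extrapolated for primitives that drop out of view. As a rough illustration only, the sketch below assumes one plausible reading: a hidden primitive inherits the relative rigid motion of a visible primitive from the same motion group. The names `Pose`, `compose`, `inverse`, and `extrapolate_hidden_pose` are hypothetical and not taken from the paper, and the paper's optimisation pipeline is not reproduced here.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Mat3 = tuple[Vec3, Vec3, Vec3]


@dataclass(frozen=True)
class Pose:
    """Rigid transform x -> R x + t (hypothetical helper, not from the paper)."""
    R: Mat3
    t: Vec3

    def apply(self, p: Vec3) -> Vec3:
        # Rotate the point, then translate it.
        return tuple(
            sum(self.R[i][j] * p[j] for j in range(3)) + self.t[i]
            for i in range(3)
        )


def compose(a: Pose, b: Pose) -> Pose:
    """Return the transform that applies b first, then a."""
    R = tuple(
        tuple(sum(a.R[i][k] * b.R[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )
    return Pose(R, a.apply(b.t))


def inverse(a: Pose) -> Pose:
    """Invert a rigid transform: R^T and -R^T t."""
    Rt = tuple(tuple(a.R[j][i] for j in range(3)) for i in range(3))
    t = tuple(-sum(Rt[i][j] * a.t[j] for j in range(3)) for i in range(3))
    return Pose(Rt, t)


def extrapolate_hidden_pose(hidden_last: Pose, ref_last: Pose, ref_now: Pose) -> Pose:
    """Assumed rule: move the hidden primitive by the same relative motion
    that a visible primitive of its motion group underwent since the hidden
    primitive was last observed."""
    relative = compose(ref_now, inverse(ref_last))
    return compose(relative, hidden_last)


IDENTITY: Mat3 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ROT_Z_90: Mat3 = ((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))

# A hidden primitive last seen at (1, 0, 0); its visible group member has
# since rotated 90 degrees about the z axis around the origin.
hidden_last = Pose(IDENTITY, (1.0, 0.0, 0.0))
ref_last = Pose(IDENTITY, (0.0, 0.0, 0.0))
ref_now = Pose(ROT_Z_90, (0.0, 0.0, 0.0))
hidden_now = extrapolate_hidden_pose(hidden_last, ref_last, ref_now)
```

Under this assumption the hidden primitive stays rigidly attached to its group, which is one simple way to obtain object permanence when replaying the scene; the actual system may weight or combine several group members differently.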

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization