MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion
Zihan Wang, Jeff Tan, Tarasha Khurana, Neehar Peri, Deva Ramanan

TL;DR
MonoFusion reconstructs dynamic 4D scenes from a small set of sparse-view videos by aligning independent monocular reconstructions of each camera into a single time- and view-consistent result.
Contribution
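The paper presents a method for reconstructing dynamic scenes from a few cameras (e.g. four equidistant inward-facing static cameras) by aligning per-camera monocular reconstructions. It targets the limited overlap between viewpoints that makes dense multi-view methods struggle in sparse-view setups.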
Findings
Outperforms prior methods in sparse-view scenarios
Achieves higher quality and more consistent reconstructions
Effective in rendering novel views from limited cameras
Abstract
We address the problem of dynamic scene reconstruction from sparse-view videos. Prior work often requires dense multi-view captures with hundreds of calibrated cameras (e.g. Panoptic Studio). Such multi-view setups are prohibitively expensive to build and cannot capture diverse scenes in-the-wild. In contrast, we aim to reconstruct dynamic human behaviors, such as repairing a bike or dancing, from a small set of sparse-view cameras with complete scene coverage (e.g. four equidistant inward-facing static cameras). We find that dense multi-view reconstruction methods struggle to adapt to this sparse-view setup due to limited overlap between viewpoints. To address these limitations, we carefully align independent monocular reconstructions of each camera to produce time- and view-consistent dynamic scene reconstructions. Extensive experiments on PanopticStudio and Ego-Exo4D demonstrate that…
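To make the idea of aligning independent monocular reconstructions concrete, here is a minimal sketch of one generic alignment step. It is not taken from the paper, and the abstract does not confirm that MonoFusion works this way. It assumes each camera has a monocular depth map known only up to an unknown scale and shift, a sparse reference depth on some pixels, and known pinhole intrinsics and camera-to-world poses. The function names `fit_scale_shift` and `backproject_to_world` are hypothetical.

```python
import numpy as np

def fit_scale_shift(mono_depth, ref_depth, mask):
    """Least-squares scale s and shift t so that s * mono_depth + t ~ ref_depth on mask.

    Assumption for illustration: ref_depth holds trusted depth values
    on the pixels selected by the boolean mask.
    """
    x = mono_depth[mask].ravel()
    y = ref_depth[mask].ravel()
    # Solve [x 1] @ [s, t]^T = y in the least-squares sense.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(s), float(t)

def backproject_to_world(depth, K, cam_to_world):
    """Lift an (H, W) depth map to (H*W, 3) world-space points with a pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Homogeneous pixel coordinates, one row per pixel in row-major order.
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T          # camera-space rays with z = 1
    pts_cam = rays * depth.reshape(-1, 1)    # scale each ray by its depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ cam_to_world.T)[:, :3]   # apply the 4x4 camera-to-world pose
```

Under these assumptions, each camera's depth would be corrected as `s * mono_depth + t` and then back-projected, so that the point clouds from all cameras share one world frame. Keeping the alignment consistent over time as well as across views would need additional constraints that this sketch does not cover.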
Taxonomy
Topics: Optical Coherence Tomography Applications · Advanced Vision and Imaging
