World Reconstruction From Inconsistent Views
Lukas Höllein, Matthias Nießner

TL;DR
This paper introduces a method to reconstruct consistent 3D worlds from videos with inconsistent frames by aligning and optimizing pointclouds, enabling high-quality 3D environment generation from video diffusion models.
Contribution
The paper presents a novel non-rigid alignment and global optimization approach for 3D reconstruction from inconsistent video frames, improving 3D scene quality and consistency.
Findings
Higher quality 3D reconstructions than baselines
Effective alignment of inconsistent video frames
Generation of explorable 3D environments
Abstract
Video diffusion models generate high-quality and diverse worlds; however, individual frames often lack 3D consistency across the output sequence, which makes the reconstruction of 3D worlds difficult. To this end, we propose a new method that handles these inconsistencies by non-rigidly aligning the video frames into a globally-consistent coordinate frame that produces sharp and detailed pointcloud reconstructions. First, a geometric foundation model lifts each frame into a pixel-wise 3D pointcloud, which contains unaligned surfaces due to these inconsistencies. We then propose a tailored non-rigid iterative frame-to-model ICP to obtain an initial alignment across all frames, followed by a global optimization that further sharpens the pointcloud. Finally, we leverage this pointcloud as initialization for 3D reconstruction and propose a novel inverse deformation rendering loss to create…
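The frame-to-model alignment idea mentioned in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' method: it uses plain rigid point-to-point ICP (nearest-neighbor matching plus a Kabsch fit), whereas the paper describes a tailored non-rigid variant followed by a global optimization. All function names (`nearest_neighbors`, `kabsch`, `rigid_icp`, `frame_to_model`) are hypothetical, and each frame is assumed to already be an N×3 NumPy array of 3D points.

```python
import numpy as np

def nearest_neighbors(src, dst):
    """Index of the closest dst point for every src point (brute force)."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst,
    given one-to-one correspondences between the rows."""
    src_c, dst_c = src.mean(0), dst.mean(0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def rigid_icp(src, model, iters=20):
    """Rigid ICP: a simplified stand-in for the non-rigid ICP in the abstract."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        idx = nearest_neighbors(cur, model)
        R, t = kabsch(cur, model[idx])
        cur = cur @ R.T + t
        # Compose the incremental update with the accumulated transform.
        R_total = R @ R_total
        t_total = R @ t_total + t
    return R_total, t_total, cur

def frame_to_model(frames, iters=20):
    """Align each frame to the growing model, then merge it in."""
    model = frames[0].copy()
    for pts in frames[1:]:
        _, _, aligned = rigid_icp(pts, model, iters)
        model = np.vstack([model, aligned])
    return model
```

A rigid transform can only remove a global pose offset per frame. The inconsistencies the abstract describes are local surface misalignments, which is presumably why the authors turn to a non-rigid formulation; this sketch only shows the matching-then-fitting loop that such methods build on.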
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
