World Reconstruction From Inconsistent Views

Lukas Höllein; Matthias Nießner

arXiv:2603.16736·cs.CV·March 19, 2026

TL;DR

This paper introduces a method to reconstruct consistent 3D worlds from videos with inconsistent frames by aligning and optimizing pointclouds, enabling high-quality 3D environment generation from video diffusion models.

Contribution

The paper presents a novel non-rigid alignment and global optimization approach for 3D reconstruction from inconsistent video frames, improving 3D scene quality and consistency.

Findings

01. Higher quality 3D reconstructions than baselines

02. Effective alignment of inconsistent video frames

03. Generation of explorable 3D environments

Abstract

Video diffusion models generate high-quality and diverse worlds; however, individual frames often lack 3D consistency across the output sequence, which makes the reconstruction of 3D worlds difficult. To this end, we propose a new method that handles these inconsistencies by non-rigidly aligning the video frames into a globally-consistent coordinate frame that produces sharp and detailed pointcloud reconstructions. First, a geometric foundation model lifts each frame into a pixel-wise 3D pointcloud, which contains unaligned surfaces due to these inconsistencies. We then propose a tailored non-rigid iterative frame-to-model ICP to obtain an initial alignment across all frames, followed by a global optimization that further sharpens the pointcloud. Finally, we leverage this pointcloud as initialization for 3D reconstruction and propose a novel inverse deformation rendering loss to create…
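
To make the frame-to-model idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes each frame has already been lifted to an (N, 3) pointcloud and, as a loud simplification, it aligns frames with a rigid ICP (closed-form Kabsch step plus brute-force nearest neighbors) rather than the non-rigid ICP and global optimization the paper proposes. All function names (`kabsch`, `nearest`, `icp_rigid`, `frame_to_model`) are hypothetical.

```python
import numpy as np


def kabsch(src, dst):
    """Least-squares rigid transform (R, t) with dst ~= src @ R.T + t.

    src and dst are (N, 3) arrays with row-wise correspondences.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def nearest(src, model):
    """Index of the closest model point for every source point (brute force)."""
    d2 = ((src[:, None, :] - model[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def icp_rigid(src, model, iters=20):
    """Rigid ICP: alternate nearest-neighbor matching and a Kabsch update.

    ASSUMPTION: rigid stand-in for the paper's non-rigid alignment.
    """
    cur = src.copy()
    for _ in range(iters):
        idx = nearest(cur, model)
        R, t = kabsch(cur, model[idx])
        cur = cur @ R.T + t
    return cur


def frame_to_model(frames, iters=20):
    """Align each frame to the growing model, then merge it into the model."""
    model = frames[0].copy()
    for pts in frames[1:]:
        aligned = icp_rigid(pts, model, iters)
        model = np.vstack([model, aligned])
    return model


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.random((100, 3))
    angle = 0.01                                  # small rotation about z
    Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
    moved = base @ Rz.T + np.array([0.005, 0.0, 0.0])
    merged = frame_to_model([base, moved])
    print(merged.shape)
```

A rigid transform cannot absorb the per-frame geometric inconsistencies the abstract describes, which is why the paper replaces this step with a non-rigid variant; the sketch only illustrates the match, solve, merge loop that a frame-to-model scheme follows.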

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis