Recollection from Pensieve: Novel View Synthesis via Learning from Uncalibrated Videos

Ruoyu Wang; Yi Ma; Shenghua Gao

arXiv:2505.13440 · cs.CV · May 20, 2025

TL;DR

This paper introduces a two-stage self-supervised approach for novel view synthesis from uncalibrated videos, combining implicit scene learning with explicit 3D primitive prediction to achieve high-quality results without prior geometric information.

Contribution

It proposes a novel two-stage training strategy that enables view synthesis from raw uncalibrated videos without geometric priors, bridging the gap between implicit and explicit 3D representations.

Findings

01. Achieves high-quality novel view synthesis without camera calibration.

02. Provides accurate camera pose estimation from uncalibrated videos.

03. Demonstrates effectiveness on large-scale uncalibrated video datasets.

Abstract

Currently, almost all state-of-the-art novel view synthesis and reconstruction models rely on calibrated cameras or additional geometric priors for training. These prerequisites significantly limit their applicability to massive uncalibrated data. To alleviate this requirement and unlock the potential for self-supervised training on large-scale uncalibrated videos, we propose a novel two-stage strategy to train a view synthesis model from only raw video frames or multi-view images, without providing camera parameters or other priors. In the first stage, we learn to reconstruct the scene implicitly in a latent space without relying on any explicit 3D representation. Specifically, we predict per-frame latent camera and scene context features, and employ a view synthesis model as a proxy for explicit rendering. This pretraining stage substantially reduces the optimization complexity and…
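
The first stage described in the abstract can be pictured with a toy data-flow sketch. This is not the authors' implementation: the linear "encoder" and "proxy renderer", the latent sizes, the choice to take scene context from a source frame and the latent camera from the target frame, and every name below are assumptions made purely for illustration. It only shows how a photometric loss could supervise latent camera and context features without any camera parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; all values here are hypothetical, not taken from the paper.
H, W, C = 8, 8, 3        # frame height, width, channels
D_CAM, D_CTX = 4, 16     # latent camera and scene-context dimensions


def init_params():
    """Random weights for a linear stand-in encoder and proxy renderer."""
    d_in = H * W * C
    return {
        "w_cam": rng.normal(0.0, 0.01, (d_in, D_CAM)),
        "w_ctx": rng.normal(0.0, 0.01, (d_in, D_CTX)),
        "w_out": rng.normal(0.0, 0.01, (D_CTX + D_CAM, d_in)),
    }


def encode(params, frame):
    """Map one frame to a (latent camera, scene context) pair."""
    x = frame.reshape(-1)
    return x @ params["w_cam"], x @ params["w_ctx"]


def proxy_render(params, ctx, cam):
    """Stand-in for the view synthesis model used in place of explicit rendering."""
    z = np.concatenate([ctx, cam])
    return (z @ params["w_out"]).reshape(H, W, C)


def stage_one_loss(params, src, tgt):
    """Photometric loss: predict the target frame from source context
    and the target's latent camera (an assumed pairing)."""
    _, ctx_src = encode(params, src)
    cam_tgt, _ = encode(params, tgt)
    pred = proxy_render(params, ctx_src, cam_tgt)
    return float(np.mean((pred - tgt) ** 2))


params = init_params()
src = rng.random((H, W, C))   # stand-in for one video frame
tgt = rng.random((H, W, C))   # stand-in for a neighbouring frame
loss = stage_one_loss(params, src, tgt)
```

In a real system the linear maps would be replaced by learned networks and the loss minimized over many frame pairs; the sketch only fixes the shape of the computation.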

Code & Models

Repositories

dwawayu/pensieve · jax · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Advanced Vision and Imaging

Methods: ALIGN