WildRayZer: Self-supervised Large View Synthesis in Dynamic Environments

Xuweiyi Chen; Wentao Zhou; Zezhou Cheng

arXiv:2601.10716 · cs.CV · January 16, 2026

TL;DR

WildRayZer is a self-supervised framework for novel view synthesis in dynamic environments where both the camera and objects move. It uses the residuals of a camera-only static renderer to identify transient regions, so that supervision focuses on background reconstruction, and it is trained and evaluated on a new large-scale real-world dataset.

Contribution

The paper introduces WildRayZer, a self-supervised approach to view synthesis in dynamic scenes that builds pseudo motion masks from rendering residuals, along with a new dataset, Dynamic RealEstate10K (D-RE10K), for training and evaluation.

Findings

01. Outperforms existing methods in transient-region removal
02. Achieves higher-quality full-frame NVS with a single feed-forward pass
03. Demonstrates robustness in dynamic, real-world scenarios

Abstract

We present WildRayZer, a self-supervised framework for novel view synthesis (NVS) in dynamic environments where both the camera and objects move. Dynamic content breaks the multi-view consistency that static NVS models rely on, leading to ghosting, hallucinated geometry, and unstable pose estimation. WildRayZer addresses this by performing an analysis-by-synthesis test: a camera-only static renderer explains rigid structure, and its residuals reveal transient regions. From these residuals, we construct pseudo motion masks, distill a motion estimator, and use it to mask input tokens and gate loss gradients so supervision focuses on cross-view background completion. To enable large-scale training and evaluation, we curate Dynamic RealEstate10K (D-RE10K), a real-world dataset of 15K casually captured dynamic sequences, and D-RE10K-iPhone, a paired transient and clean benchmark for…
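
The residual-to-mask idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the fixed residual threshold, the patch size, the per-token transient fraction, and all function names (`pseudo_motion_mask`, `masked_photometric_loss`, `token_keep_mask`) are assumptions made here for illustration. The abstract also describes distilling a learned motion estimator from the pseudo masks, which this sketch omits.

```python
import numpy as np

def pseudo_motion_mask(rendered, target, threshold=0.1):
    """Flag pixels that a static rendering explains poorly.

    Assumption: a fixed threshold on the mean absolute residual.
    The abstract does not state how residuals become masks.
    """
    residual = np.abs(rendered - target).mean(axis=-1)  # (H, W)
    return residual > threshold                         # True = transient

def masked_photometric_loss(rendered, target, transient_mask):
    """Mean squared error over pixels NOT flagged as transient."""
    keep = ~transient_mask
    sq_err = ((rendered - target) ** 2).mean(axis=-1)   # (H, W)
    return float((sq_err * keep).sum() / max(keep.sum(), 1))

def token_keep_mask(transient_mask, patch=4, max_transient_frac=0.5):
    """Per-patch keep decision: drop tokens that are mostly transient.

    Assumes H and W are divisible by `patch`.
    """
    h, w = transient_mask.shape
    blocks = transient_mask.reshape(h // patch, patch, w // patch, patch)
    transient_frac = blocks.mean(axis=(1, 3))           # (H/patch, W/patch)
    return transient_frac <= max_transient_frac

# Toy example: the target contains a bright "moving object" in the
# top-left corner that the static rendering does not reproduce.
target = np.zeros((8, 8, 3))
rendered = target.copy()
target[:4, :4] = 1.0

mask = pseudo_motion_mask(rendered, target)
loss = masked_photometric_loss(rendered, target, mask)
tokens = token_keep_mask(mask)
```

In this toy case the 16 object pixels are flagged as transient, the masked loss ignores them, and only the top-left of the four 4x4 tokens is dropped. Gating the loss this way means the error caused by the moving object contributes no training signal, which matches the abstract's stated goal of focusing supervision on the background.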

Code & Models

Datasets

uva-cv-lab/Dynamic-RE10K
dataset · 505 downloads

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · 3D Shape Modeling and Analysis