WildRayZer: Self-supervised Large View Synthesis in Dynamic Environments
Xuweiyi Chen, Wentao Zhou, Zezhou Cheng

TL;DR
WildRayZer is a self-supervised framework for novel view synthesis in dynamic environments where both the camera and objects move. It uses the residuals of a static renderer to locate transient regions so that supervision focuses on background reconstruction, and it is evaluated on a new large-scale real-world dataset.
Contribution
The paper introduces WildRayZer, a self-supervised approach to view synthesis in dynamic scenes that uses residual analysis and pseudo motion masks, together with a new dataset, Dynamic RealEstate10K (D-RE10K), for training and evaluation.
Findings
Outperforms existing methods in transient-region removal
Achieves higher quality full-frame NVS with a single feed-forward pass
Demonstrates robustness in dynamic, real-world scenarios
Abstract
We present WildRayZer, a self-supervised framework for novel view synthesis (NVS) in dynamic environments where both the camera and objects move. Dynamic content breaks the multi-view consistency that static NVS models rely on, leading to ghosting, hallucinated geometry, and unstable pose estimation. WildRayZer addresses this by performing an analysis-by-synthesis test: a camera-only static renderer explains rigid structure, and its residuals reveal transient regions. From these residuals, we construct pseudo motion masks, distill a motion estimator, and use it to mask input tokens and gate loss gradients so supervision focuses on cross-view background completion. To enable large-scale training and evaluation, we curate Dynamic RealEstate10K (D-RE10K), a real-world dataset of 15K casually captured dynamic sequences, and D-RE10K-iPhone, a paired transient and clean benchmark for…
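The residual-to-mask idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the mean-absolute-residual threshold, and the majority rule for dropping patch tokens are all assumptions made for illustration, and the distilled motion estimator is omitted (the thresholded residual stands in for its output). Excluding masked pixels from the loss is how the sketch mimics gradient gating; in an autodiff framework those pixels would contribute zero gradient.

```python
import numpy as np

def pseudo_motion_mask(rendered, target, threshold=0.1):
    """Flag pixels that a static-scene render fails to explain.

    rendered, target: float arrays of shape (H, W, C).
    threshold is a hypothetical hyperparameter, not taken from the paper.
    """
    residual = np.abs(rendered - target).mean(axis=-1)  # (H, W)
    return residual > threshold                         # True = transient

def masked_photometric_loss(rendered, target, transient_mask):
    """Mean squared error over pixels not flagged as transient."""
    keep = ~transient_mask
    sq_err = ((rendered - target) ** 2).mean(axis=-1)   # (H, W)
    n_keep = keep.sum()
    if n_keep == 0:
        return 0.0
    return float((sq_err * keep).sum() / n_keep)

def keep_static_tokens(tokens, transient_mask, patch):
    """Drop patch tokens whose patch is more than half transient.

    tokens: (num_patches, D), row-major over the patch grid.
    """
    h, w = transient_mask.shape
    ph, pw = h // patch, w // patch
    grid = transient_mask[: ph * patch, : pw * patch]
    # Fraction of transient pixels inside each patch, shape (ph, pw).
    frac = grid.reshape(ph, patch, pw, patch).mean(axis=(1, 3))
    keep = (frac <= 0.5).reshape(-1)
    return tokens[keep], keep

# Toy example: a static render of an empty scene versus a target frame
# that contains a bright moving object in the top-left corner.
rendered = np.zeros((8, 8, 3))
target = np.zeros((8, 8, 3))
target[:4, :4] = 1.0

mask = pseudo_motion_mask(rendered, target)
loss = masked_photometric_loss(rendered, target, mask)
tokens = np.arange(20, dtype=float).reshape(4, 5)  # 2x2 patch grid, D=5
kept_tokens, kept = keep_static_tokens(tokens, mask, patch=4)
```

In this toy case the mask covers exactly the object, so the remaining background pixels give zero loss, and the one patch containing the object is removed from the token set.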
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · 3D Shape Modeling and Analysis
