TL;DR
MoCam is a diffusion-based method that applies geometric priors first and appearance priors second during denoising, improving novel view synthesis, especially when the available geometry is incomplete.
Contribution
MoCam proposes structured denoising dynamics: geometric alignment is handled in the early diffusion stages and appearance refinement in the later ones. Decoupling the two over time unifies static and dynamic view synthesis.
Findings
MoCam outperforms prior methods on challenging datasets with geometric holes.
It achieves robust geometry-appearance disentanglement.
The method effectively refines details by switching from geometry to appearance priors.
Abstract
Generative novel view synthesis faces a fundamental dilemma: geometric priors provide spatial alignment but become sparse and inaccurate under view changes, while appearance priors offer visual fidelity but lack geometric correspondence. Existing methods either propagate geometric errors throughout generation or suffer from signal conflicts when fusing both statically. We introduce MoCam, which employs structured denoising dynamics to orchestrate a coordinated progression from geometry to appearance within the diffusion process. MoCam first leverages geometric priors in early stages to anchor coarse structures and tolerate their incompleteness, then switches to appearance priors in later stages to actively correct geometric errors and refine details. This design naturally unifies static and dynamic view synthesis by temporally decoupling geometric alignment and appearance refinement…
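As a rough illustration of the geometry-to-appearance progression described in the abstract, the sketch below assigns a conditioning prior to each denoising step. It is not MoCam's implementation. The hard switch point, the `switch_fraction` parameter, and the function names are all hypothetical, since the abstract does not say when or how the switch happens.

```python
from typing import List


def prior_for_step(step: int, num_steps: int, switch_fraction: float = 0.5) -> str:
    """Pick the prior that conditions one denoising step.

    step 0 is the noisiest (earliest) step. `switch_fraction` is a
    hypothetical knob: the share of steps that use the geometric prior.
    """
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    if not 0 <= step < num_steps:
        raise ValueError("step must be in [0, num_steps)")
    if not 0.0 <= switch_fraction <= 1.0:
        raise ValueError("switch_fraction must be in [0, 1]")

    # Assumption: a single hard switch. Early steps anchor coarse structure
    # with geometry; later steps refine details with appearance.
    switch_step = int(num_steps * switch_fraction)
    return "geometry" if step < switch_step else "appearance"


def conditioning_schedule(num_steps: int, switch_fraction: float = 0.5) -> List[str]:
    """Return the prior used at every step, from noisiest to cleanest."""
    return [prior_for_step(s, num_steps, switch_fraction) for s in range(num_steps)]


if __name__ == "__main__":
    for step, prior in enumerate(conditioning_schedule(8, switch_fraction=0.25)):
        print(f"step {step}: condition on {prior} prior")
```

In a real sampler, the returned label would select which conditioning signal is passed to the denoiser at that step. A soft blend between the two priors is an equally plausible reading of the abstract.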
