Landscape-Awareness for Geometric View Diffusion Model

Yan-Ting Chen, Hao-Wei Chen, Tsu-Ching Hsiao, and Chun-Yi Lee

arXiv:2605.19865 · cs.CV · May 20, 2026

TL;DR

This paper introduces a score-based approach to improve camera viewpoint estimation from sparse views by reshaping the optimization landscape, leading to better convergence and accuracy.

Contribution

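The paper proposes a score-based method that reshapes the optimization landscape so that gradient updates are less likely to be misled by geometric ambiguities such as symmetry and self-similarity, making viewpoint estimation with view diffusion models more sample-efficient.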

Findings

01  Improved convergence and accuracy in viewpoint estimation.

02  Reduced reliance on brute-force sampling.

03  Enhanced sample efficiency in geometric view diffusion models.

Abstract

Accurate camera viewpoint estimation under sparse-view conditions remains challenging, particularly in two-view scenarios. Recent approaches leverage diffusion models such as Zero123 to synthesize novel views conditioned on the relative viewpoint, and show promising results when repurposed for viewpoint estimation via optimization with an MSE loss. However, existing methods often suffer from a nonconvex loss landscape with numerous local minima, which makes them sensitive to initialization and reliant on naive multistart strategies. We analyze these optimization challenges and visualize failure cases, showing that geometric ambiguities, such as symmetry and self-similarity, can mislead gradient-based updates toward incorrect viewpoints. To address these limitations, we propose a score-based method that reshapes the optimization landscape to guide updates toward the ground-truth viewpoint, followed by…
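The initialization sensitivity described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method and does not involve a diffusion model. It assumes a hypothetical one-dimensional loss over azimuth (`toy_loss`, with an assumed ground truth of 1.0 rad) whose second harmonic mimics a 180-degree symmetry ambiguity, and it shows plain gradient descent falling into the spurious minimum unless a multistart strategy is used. All function names and constants are illustrative assumptions.

```python
import math

TWO_PI = 2.0 * math.pi

def toy_loss(theta, theta_true=1.0):
    """Hypothetical stand-in for an image-space MSE over azimuth (radians).

    The cos(2*d) term mimics a symmetric object: the view from the
    opposite side (d = pi) is a spurious local minimum.
    """
    d = theta - theta_true
    return 1.0 - 0.6 * math.cos(d) - 0.4 * math.cos(2.0 * d)

def toy_grad(theta, theta_true=1.0):
    """Analytic derivative of toy_loss with respect to theta."""
    d = theta - theta_true
    return 0.6 * math.sin(d) + 0.8 * math.sin(2.0 * d)

def descend(theta0, lr=0.1, steps=500):
    """Plain gradient descent from a single initial guess."""
    theta = theta0
    for _ in range(steps):
        theta -= lr * toy_grad(theta)
    return theta

def multistart(num_starts=8):
    """Naive multistart: descend from evenly spaced guesses, keep the best."""
    starts = [k * TWO_PI / num_starts for k in range(num_starts)]
    finals = [descend(s) for s in starts]
    return min(finals, key=toy_loss)

def angular_error(a, b):
    """Absolute angular difference, wrapped to [0, pi]."""
    return abs((a - b + math.pi) % TWO_PI - math.pi)

good = descend(1.5)   # starts inside the basin of the true viewpoint
bad = descend(3.5)    # starts inside the basin of the mirrored viewpoint
best = multistart()

print("near start:", angular_error(good, 1.0), toy_loss(good))
print("far start: ", angular_error(bad, 1.0), toy_loss(bad))
print("multistart:", angular_error(best, 1.0), toy_loss(best))
```

In this toy setting the far start settles about pi radians away from the assumed ground truth, while multistart recovers it at the cost of eight separate descents. That cost is the kind of brute-force sampling the abstract says a reshaped landscape is meant to reduce.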
