VGGT-360: Geometry-Consistent Zero-Shot Panoramic Depth Estimation
Jiayi Yuan, Haobo Jiang, De Wen Soh, Na Zhao

TL;DR
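VGGT-360 is a training-free, zero-shot framework for panoramic depth estimation. It slices a panorama into perspective views, reconstructs a multi-view 3D model from them with a VGGT-like foundation model, and reprojects that model into a geometry-consistent panoramic depth map.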
Contribution
VGGT-360 reformulates panoramic depth estimation as panoramic reprojection over multi-view reconstructed 3D models, with no additional training. It integrates three plug-and-play modules covering uncertainty-guided adaptive projection, structure saliency, and correlation-based correction of the 3D model.
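The reprojection step can be illustrated with a minimal sketch. It assumes the reconstructed 3D points are already expressed in the panorama camera frame, with +z forward and +y down (an axis convention chosen here, not stated in the paper). The function name `points_to_equirect_depth` is hypothetical, and the nearest-point splatting is a simple stand-in, not the authors' implementation.

```python
import numpy as np

def points_to_equirect_depth(points, height, width):
    """Splat 3D points of shape (N, 3) into an equirectangular depth map.

    Assumes points are in the panorama camera frame (+z forward, +y down).
    Depth is the radial distance; the nearest point wins in each pixel.
    """
    points = np.asarray(points, dtype=float)
    r = np.linalg.norm(points, axis=1)           # radial distance per point
    keep = r > 1e-8                              # drop points at the origin
    x, y, z, r = points[keep, 0], points[keep, 1], points[keep, 2], r[keep]

    lon = np.arctan2(x, z)                       # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y / r, -1.0, 1.0))   # latitude in [-pi/2, pi/2]

    # Map angles to pixel indices; longitude wraps, latitude is clamped.
    u = ((lon / (2 * np.pi) + 0.5) * width).astype(int) % width
    v = np.clip(((lat / np.pi + 0.5) * height).astype(int), 0, height - 1)

    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), r)              # z-buffer: keep the nearest point
    depth[np.isinf(depth)] = 0.0                 # 0 marks pixels no point reached
    return depth

# Two points straight ahead (the nearer one should win) and one to the right.
pts = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, 5.0], [1.0, 0.0, 0.0]])
demo_depth = points_to_equirect_depth(pts, height=4, width=8)
```

Because every pixel is filled from one shared 3D model, neighbouring regions of the panorama cannot disagree about scale. That is the sense in which the reprojection formulation is geometry-consistent.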
Findings
VGGT-360 outperforms state-of-the-art methods across multiple datasets.
Uncertainty-guided adaptive projection bridges the domain gap between panoramic inputs and VGGT's perspective prior.
Structure-aware attention and 3D model correction improve robustness and accuracy.
Abstract
This paper presents VGGT-360, a novel training-free framework for zero-shot, geometry-consistent panoramic depth estimation. Unlike prior view-independent training-free approaches, VGGT-360 reformulates the task as panoramic reprojection over multi-view reconstructed 3D models by leveraging the intrinsic 3D consistency of VGGT-like foundation models, thereby unifying fragmented per-view reasoning into a coherent panoramic understanding. To achieve robust and accurate estimation, VGGT-360 integrates three plug-and-play modules that form a unified panorama-to-3D-to-depth framework: (i) Uncertainty-guided adaptive projection slices panoramas into perspective views to bridge the domain gap between panoramic inputs and VGGT's perspective prior. It estimates gradient-based uncertainty to allocate denser views to geometry-poor regions, yielding geometry-informative inputs for VGGT. (ii)…
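The view-allocation idea in module (i) can be sketched as follows. The abstract does not give the uncertainty formula, so this sketch assumes that low image-gradient energy marks a geometry-poor region and uses inverse gradient energy as the uncertainty. The function name `allocate_views`, the sector layout, and all parameters are hypothetical.

```python
import numpy as np

def allocate_views(gray_pano, n_sectors=8, total_views=24, min_views=1):
    """Assign a number of perspective views to each longitude sector.

    ASSUMPTION: uncertainty = 1 / mean gradient magnitude, so flat,
    texture-poor sectors receive more views. This is not the paper's formula.
    """
    gy, gx = np.gradient(np.asarray(gray_pano, dtype=float))
    grad_mag = np.hypot(gx, gy)

    # Mean gradient energy per vertical strip (longitude sector).
    sectors = np.array_split(grad_mag, n_sectors, axis=1)
    energy = np.array([s.mean() for s in sectors])

    uncertainty = 1.0 / (energy + 1e-6)          # assumed uncertainty proxy
    weights = uncertainty / uncertainty.sum()

    # Every sector gets min_views; the rest is shared in proportion to weight.
    extra = total_views - min_views * n_sectors
    counts = min_views + np.floor(weights * extra).astype(int)

    # Hand out views lost to rounding, most uncertain sectors first.
    remainder = int(total_views - counts.sum())
    order = np.argsort(-weights)
    counts[order[:remainder]] += 1
    return counts

# Toy panorama: flat left half, noisy (textured) right half.
rng = np.random.default_rng(0)
pano = np.zeros((32, 64))
pano[:, 32:] = rng.random((32, 32))
view_counts = allocate_views(pano)
```

On this toy input the flat left sectors receive most of the view budget, while each textured sector keeps only the minimum.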
