VGGT-360: Geometry-Consistent Zero-Shot Panoramic Depth Estimation

Jiayi Yuan; Haobo Jiang; De Wen Soh; Na Zhao

arXiv:2603.18943 · cs.CV · May 15, 2026

TL;DR

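VGGT-360 is a training-free framework for zero-shot panoramic depth estimation. It reconstructs a multi-view 3D model from perspective slices of the panorama and reprojects that model back into the panorama, using three plug-and-play modules to keep the result robust and geometry-consistent.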

Contribution

The paper introduces a training-free approach that reformulates panoramic depth estimation as panoramic reprojection over multi-view reconstructed 3D models, and integrates three modules: uncertainty-guided projection, structure-saliency attention, and correlation-based correction.
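To make the reprojection idea concrete, the following is a minimal sketch of rendering a reconstructed point cloud into an equirectangular depth map. It is an illustration only, not the authors' implementation: the function name `points_to_equirect_depth`, the axis convention (+z forward, +x right, +y down), and the nearest-point rule for pixels hit by several points are all assumptions made for this example.

```python
import numpy as np

def points_to_equirect_depth(points, height, width):
    """Render an (N, 3) point cloud, given in the panorama camera frame,
    into an equirectangular depth map (range to the nearest point per pixel).

    Hypothetical helper for illustration; pixels with no point are set to 0.
    """
    points = np.asarray(points, dtype=np.float64)
    r = np.linalg.norm(points, axis=1)
    keep = r > 1e-9                      # drop points at the camera centre
    x, y, z = points[keep].T
    r = r[keep]

    # Assumed convention: +z forward, +x right, +y down.
    lon = np.arctan2(x, z)                       # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y / r, -1.0, 1.0))   # latitude in [-pi/2, pi/2]

    # Map spherical angles to pixel indices of the equirectangular grid.
    u = ((lon / (2 * np.pi) + 0.5) * width).astype(int) % width
    v = np.clip(((lat / np.pi + 0.5) * height).astype(int), 0, height - 1)

    # Keep the closest point when several fall into the same pixel.
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), r)
    depth[np.isinf(depth)] = 0.0
    return depth
```

Because every pixel is filled from one shared 3D model rather than from independent per-view predictions, overlapping views cannot disagree about the depth of the same surface, which is the sense in which such a reprojection is geometry-consistent.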

Findings

01 VGGT-360 outperforms state-of-the-art methods across multiple datasets.

02 Uncertainty-guided adaptive projection bridges the domain gap between panoramic inputs and VGGT's perspective prior.

03 Structure-aware attention and 3D model correction improve robustness and accuracy.

Abstract

This paper presents VGGT-360, a novel training-free framework for zero-shot, geometry-consistent panoramic depth estimation. Unlike prior view-independent training-free approaches, VGGT-360 reformulates the task as panoramic reprojection over multi-view reconstructed 3D models by leveraging the intrinsic 3D consistency of VGGT-like foundation models, thereby unifying fragmented per-view reasoning into a coherent panoramic understanding. To achieve robust and accurate estimation, VGGT-360 integrates three plug-and-play modules that form a unified panorama-to-3D-to-depth framework: (i) Uncertainty-guided adaptive projection slices panoramas into perspective views to bridge the domain gap between panoramic inputs and VGGT's perspective prior. It estimates gradient-based uncertainty to allocate denser views to geometry-poor regions, yielding geometry-informative inputs for VGGT. (ii)…
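The view-allocation step in (i) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: the abstract does not specify how the gradient-based uncertainty is computed, so the sketch assumes that weak image gradients mark a geometry-poor region, splits the panorama into equal yaw sectors, and hands out a fixed view budget in proportion to each sector's uncertainty. The function `allocate_views` and its parameters `n_sectors`, `total_views`, and `min_views` are hypothetical.

```python
import numpy as np

def allocate_views(gray, n_sectors=8, total_views=24, min_views=1):
    """Assign a number of perspective views to each yaw sector of a
    grayscale equirectangular panorama, giving more views to sectors
    with weaker image gradients.

    Assumes total_views >= min_views * n_sectors.
    """
    gray = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(gray)           # derivatives along rows, columns
    grad_mag = np.hypot(gx, gy)

    # Mean gradient magnitude per vertical strip (yaw sector).
    sectors = np.array_split(grad_mag, n_sectors, axis=1)
    mean_grad = np.array([s.mean() for s in sectors])

    # Assumption: low gradient energy means high uncertainty.
    uncertainty = 1.0 / (mean_grad + 1e-6)
    weights = uncertainty / uncertainty.sum()

    # Every sector gets min_views; the rest is shared by uncertainty.
    spare = total_views - min_views * n_sectors
    views = min_views + np.floor(weights * spare).astype(int)

    # Hand out views lost to rounding, most uncertain sectors first.
    order = np.argsort(-weights)
    for i in range(total_views - views.sum()):
        views[order[i % n_sectors]] += 1
    return views
```

Each sector's count could then be turned into evenly spaced yaw angles inside that sector before slicing perspective views for VGGT.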
