PanoVGGT: Feed-Forward 3D Reconstruction from Panoramic Imagery
Yijing Guo, Mengjun Chao, Luo Wang, Tianyang Zhao, Haizhao Dai, Yingliang Zhang, Jingyi Yu, Yujiao Shi

TL;DR
PanoVGGT is a Transformer-based model that jointly estimates camera poses, depth maps, and 3D point clouds from one or more panoramic images in a single forward pass. It targets the non-pinhole distortions of panoramas, which feed-forward models built for perspective cameras handle poorly.
Contribution
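It introduces a spherical-aware Transformer framework with spherical-aware positional embeddings, a panorama-specific three-axis SO(3) rotation augmentation, and a stochastic anchoring strategy for training, along with PanoCity, a large-scale panoramic dataset of outdoor scenes.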
Findings
Achieves competitive accuracy on benchmarks
Demonstrates robustness and cross-domain generalization
Provides a new panoramic dataset with dense annotations
Abstract
Panoramic imagery offers a full 360° field of view and is increasingly common in consumer devices. However, it introduces non-pinhole distortions that challenge joint pose estimation and 3D reconstruction. Existing feed-forward models, built for perspective cameras, generalize poorly to this setting. We propose PanoVGGT, a permutation-equivariant Transformer framework that jointly predicts camera poses, depth maps, and 3D point clouds from one or multiple panoramas in a single forward pass. The model incorporates spherical-aware positional embeddings and a panorama-specific three-axis SO(3) rotation augmentation, enabling effective geometric reasoning in the spherical domain. To resolve inherent global-frame ambiguity, we further introduce a stochastic anchoring strategy during training. In addition, we contribute PanoCity, a large-scale outdoor panoramic dataset with dense depth…
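To make the three-axis SO(3) rotation augmentation mentioned in the abstract concrete, here is a minimal NumPy sketch of rotating an equirectangular panorama by an arbitrary rotation matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names (`pixel_directions`, `rotation_from_euler`, `rotate_panorama`), the axis convention (y up, z forward), the yaw-pitch-roll composition order, and the nearest-neighbour resampling are all hypothetical choices made for this example. In a real training pipeline the ground-truth camera poses would also have to be updated consistently with the applied rotation, which this sketch does not cover.

```python
import numpy as np


def pixel_directions(height, width):
    """Unit viewing direction for every equirectangular pixel centre, shape (H, W, 3).

    Assumed convention: longitude grows left to right over [-pi, pi),
    latitude runs from +pi/2 (top row) to -pi/2 (bottom row), y is up, z is forward.
    """
    lon = (np.arange(width) + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(height) + 0.5) / height * np.pi
    lon, lat = np.meshgrid(lon, lat)  # both (H, W)
    return np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )


def rotation_from_euler(yaw, pitch, roll):
    """Compose a rotation matrix from three axis angles (assumed order: Ry @ Rx @ Rz)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rot_z = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return rot_y @ rot_x @ rot_z


def rotate_panorama(image, rotation):
    """Resample an equirectangular image (H, W) or (H, W, C) under a 3x3 rotation.

    Each output pixel looks up the input pixel whose direction is R^T d_out,
    using nearest-neighbour sampling to keep the sketch short.
    """
    height, width = image.shape[:2]
    # Right-multiplying row vectors by R applies R^T to each direction.
    dirs = pixel_directions(height, width) @ rotation
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))
    u = np.floor((lon + np.pi) / (2.0 * np.pi) * width).astype(int) % width
    v = np.floor((np.pi / 2.0 - lat) / np.pi * height).astype(int)
    v = np.clip(v, 0, height - 1)
    return image[v, u]


def random_rotation_augment(image, rng):
    """Apply a random three-axis rotation; uniform angle sampling is an assumption."""
    yaw, pitch, roll = rng.uniform(-np.pi, np.pi, size=3)
    rotation = rotation_from_euler(yaw, pitch, roll)
    return rotate_panorama(image, rotation), rotation
```

Because a panorama covers the full sphere, such a rotation only resamples pixels and never exposes empty regions, which is what makes full SO(3) augmentation practical here in a way it is not for perspective crops. The same resampling can be applied to a per-pixel depth (range) map, since distance along a viewing ray does not change when the camera rotates in place.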
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
