MTPano: Multi-Task Panoramic Scene Understanding via Label-Free Integration of Dense Prediction Priors

Jingdong Zhang; Xiaohang Zhan; Lingzhi Zhang; Yizhou Wang; Zhengming Yu; Jionghao Wang; Wenping Wang; Xin Li

arXiv:2602.05330 · cs.CV · April 29, 2026

TL;DR

MTPano is a multi-task panoramic scene understanding model that uses a label-free training pipeline with pseudo-labels, disentangled feature streams, and geometry-aware modules to improve performance across diverse tasks.

Contribution

MTPano introduces a label-free training pipeline and a dual-branch architecture that disentangles and integrates features for multi-task panoramic scene understanding.

Findings

1. Achieves state-of-the-art results on multiple benchmarks.

2. Effectively leverages perspective dense priors for panoramic tasks.

3. Demonstrates competitive performance against task-specific models.

Abstract

Comprehensive panoramic scene understanding is critical for immersive applications, yet it remains challenging due to the scarcity of high-resolution, multi-task annotations. While perspective foundation models have achieved success through data scaling, directly adapting them to the panoramic domain often fails due to severe geometric distortions and coordinate system discrepancies. Furthermore, the underlying relations between diverse dense prediction tasks in spherical spaces are underexplored. To address these challenges, we propose MTPano, a robust multi-task panoramic foundation model established by a label-free training pipeline. First, to circumvent data scarcity, we leverage powerful perspective dense priors. We project panoramic images into perspective patches to generate accurate, domain-gap-free pseudo-labels using off-the-shelf foundation models, which are then re-projected…
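The projection step mentioned in the abstract (panorama to perspective patch) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `perspective_patch`, the equirectangular input layout, the camera convention, and nearest-neighbour sampling are all assumptions made for illustration, and the excerpt does not state which field of view or patch layout MTPano uses.

```python
import numpy as np

def perspective_patch(pano, yaw_deg, pitch_deg, fov_deg, size):
    """Sample a square pinhole view from an equirectangular panorama.

    Hypothetical helper for illustration; uses nearest-neighbour lookup.
    Convention (assumed): camera looks along +z, x points right, y points down,
    positive pitch tilts the view upward, positive yaw turns it to the right.
    """
    h, w = pano.shape[:2]
    # Focal length in pixels for the requested field of view.
    f = 0.5 * size / np.tan(np.radians(fov_deg) / 2.0)
    # Pixel grid centred on the principal point.
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    rays = np.stack([x, y, np.full_like(x, f)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    # Rotate the rays: pitch about the x axis, then yaw about the y axis.
    p, t = np.radians(pitch_deg), np.radians(yaw_deg)
    rx = np.array([[1, 0, 0],
                   [0, np.cos(p), -np.sin(p)],
                   [0, np.sin(p), np.cos(p)]])
    ry = np.array([[np.cos(t), 0, np.sin(t)],
                   [0, 1, 0],
                   [-np.sin(t), 0, np.cos(t)]])
    rays = rays @ (ry @ rx).T
    # Unit rays -> longitude in [-pi, pi] and latitude in [-pi/2, pi/2].
    lon = np.arctan2(rays[..., 0], rays[..., 2])
    lat = np.arcsin(np.clip(rays[..., 1], -1.0, 1.0))
    # Spherical angles -> equirectangular pixel indices (wrap horizontally).
    u = ((lon / (2 * np.pi) + 0.5) * w).astype(int) % w
    v = np.clip(((lat / np.pi + 0.5) * h).astype(int), 0, h - 1)
    return pano[v, u]
```

Under these assumptions, a call such as `perspective_patch(pano, yaw_deg=0, pitch_deg=0, fov_deg=90, size=256)` returns a 256×256 forward-facing view. A perspective model could then be run on each patch, with its predictions mapped back onto the sphere by inverting the same pixel-to-angle mapping; that re-projection step is not shown here.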

Code & Models

Model (Hugging Face): jdzhang0929/MTPano
