Leader360V: The Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment

Weiming Zhang; Dingwen Xiao; Aobotao Dai; Yexin Liu; Tianbo Pan; Shiqi Wen; Lei Chen; Lin Wang

arXiv:2506.14271·cs.CV·June 18, 2025

Open Access

TL;DR

Leader360V is the first large-scale, real-world 360 video dataset, built with an automated annotation pipeline, and it enables improved multi-task learning for scene understanding in diverse environments.

Contribution

It introduces a novel large-scale 360 video dataset with an automated, multi-stage annotation pipeline utilizing pre-trained models and LLMs, addressing annotation challenges in spherical videos.

Findings

01

Dataset enhances model performance in 360 segmentation and tracking

02

Automated labeling pipeline reduces annotation cost and complexity

03

High scene diversity supports robust multi-task learning

Abstract

360 video captures the complete surrounding scene with an ultra-large field of view of 360°×180°. This makes 360 scene understanding tasks, e.g., segmentation and tracking, crucial for applications such as autonomous driving and robotics. With the recent emergence of foundation models, however, the community is impeded by the lack of large-scale, labeled real-world datasets. This is caused by the inherent spherical properties, e.g., severe distortion in polar regions and content discontinuities, which render annotation costly and complex. This paper introduces Leader360V, the first large-scale, labeled real-world 360 video dataset for instance segmentation and tracking. Our dataset has high scene diversity, ranging from indoor and urban settings to natural and dynamic outdoor scenes. To automate annotation, we design an automatic labeling pipeline, which subtly coordinates pre-trained…
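The polar distortion and content discontinuities mentioned in the abstract can be illustrated with a minimal sketch. It assumes frames are stored in the common equirectangular projection, which the abstract does not state explicitly. The function names (`pixel_to_lonlat`, `horizontal_stretch`, `wrapped_column_distance`) are hypothetical and are not part of the Leader360V pipeline.

```python
import math

def pixel_to_lonlat(u, v, width, height):
    """Map the centre of equirectangular pixel (u, v) to (longitude, latitude) in radians."""
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi   # spans [-pi, pi)
    lat = (0.5 - (v + 0.5) / height) * math.pi        # spans (-pi/2, pi/2)
    return lon, lat

def horizontal_stretch(v, height):
    """Horizontal stretch of image row v relative to the equator: 1 / cos(latitude)."""
    _, lat = pixel_to_lonlat(0, v, 1, height)
    return 1.0 / math.cos(lat)

def wrapped_column_distance(u1, u2, width):
    """Column distance that treats the left and right image borders as adjacent."""
    d = abs(u1 - u2) % width
    return min(d, width - d)

if __name__ == "__main__":
    width, height = 2048, 1024  # assumed frame size, for illustration only
    print(horizontal_stretch(height // 2, height))    # close to 1 at the equator
    print(horizontal_stretch(0, height))              # very large near the pole
    print(wrapped_column_distance(0, width - 1, width))
```

Under this assumption, rows near the top and bottom of the frame are stretched by a large factor, and an object that crosses the left or right border appears split in the image even though it is contiguous on the sphere. Both effects make per-frame mask annotation and cross-frame association harder than in perspective video.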

Taxonomy

Topics: Human Pose and Action Recognition · COVID-19 diagnosis using AI · Video Surveillance and Tracking Methods