TAPVid-360: Tracking Any Point in 360 from Narrow Field of View Video

Finlay G.C. Hudson; James A.D. Gardner; William A.P. Smith

arXiv:2511.21946 · cs.CV · December 16, 2025

TL;DR

TAPVid-360 introduces a new task and dataset for tracking scene points that lie outside the narrow field of view (FOV) of an observed video, using 360 videos as supervision to enable panoramic scene understanding.

Contribution

The paper proposes TAPVid-360, a novel task and dataset for predicting the 3D direction to queried scene points, including points outside the FOV, and adapts existing models to this new challenge.

Findings

01 The proposed baseline outperforms existing methods on the new benchmark

02 360 videos provide effective supervision for allocentric scene understanding

03 The approach enables tracking points beyond the visible field of view

Abstract

Humans excel at constructing panoramic mental models of their surroundings, maintaining object permanence and inferring scene structure beyond visible regions. In contrast, current artificial vision systems struggle with persistent, panoramic understanding, often processing scenes egocentrically on a frame-by-frame basis. This limitation is pronounced in the Track Any Point (TAP) task, where existing methods fail to track 2D points outside the field of view. To address this, we introduce TAPVid-360, a novel task that requires predicting the 3D direction to queried scene points across a video sequence, even when far outside the narrow field of view of the observed video. This task fosters learning allocentric scene representations without needing dynamic 4D ground truth scene models for training. Instead, we exploit 360 videos as a source of supervision, resampling them into narrow…
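To make the task output concrete, the following is a minimal sketch of how a pixel in a 360 (equirectangular) frame can be expressed as a unit 3D direction, how the angle between two directions can be measured, and how to check whether a direction falls inside a narrow FOV view. The function names, the axis convention (+z forward, +x right, +y up), and the FOV test are illustrative assumptions, not the paper's implementation or evaluation protocol.

```python
import math


def equirect_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit 3D direction.

    Assumed convention (hypothetical, not taken from the paper):
    longitude runs from -pi (left edge) to +pi (right edge),
    latitude runs from +pi/2 (top) to -pi/2 (bottom),
    with +z forward, +x right, +y up.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def angular_error_deg(a, b):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))


def in_fov(direction, hfov_deg, vfov_deg):
    """True if the direction is visible to a pinhole camera looking down +z."""
    x, y, z = direction
    if z <= 0.0:
        return False  # behind the camera
    within_h = abs(math.atan2(x, z)) <= math.radians(hfov_deg / 2.0)
    within_v = abs(math.atan2(y, z)) <= math.radians(vfov_deg / 2.0)
    return within_h and within_v


# Usage: the centre pixel looks straight ahead; the left edge is behind the camera.
centre = equirect_to_direction(512, 256, 1024, 512)
behind = equirect_to_direction(0, 256, 1024, 512)
print(in_fov(centre, 60, 60), in_fov(behind, 60, 60))
```

Under these assumptions, a point that leaves the narrow view still has a well-defined direction on the sphere, which is the kind of quantity the task asks a model to predict.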

Videos

TAPVid-360: Tracking Any Point in 360 from Narrow Field of View Video · SlidesLive

Taxonomy

Topics: Advanced Vision and Imaging · Face recognition and analysis · Human Pose and Action Recognition