TAPVid-3D: A Benchmark for Tracking Any Point in 3D

Skanda Koppula; Ignacio Rocco; Yi Yang; Joe Heyward; João Carreira; Andrew Zisserman; Gabriel Brostow; Carl Doersch

arXiv:2407.05921 · cs.CV · August 28, 2024

TL;DR

TAPVid-3D is a benchmark of over 4,000 real-world videos for evaluating long-range 3D point tracking, a task that previously had no benchmark or agreed-upon metrics.

Contribution

The paper presents the first large-scale 3D point tracking benchmark, including new metrics and baseline models, to advance understanding of 3D motion and surface deformation from monocular videos.

Findings

01 Benchmark with 4,000+ videos across diverse environments

02 Extended metrics for 3D point tracking accuracy

03 Baseline models demonstrate current capabilities and challenges

Abstract

We introduce a new benchmark, TAPVid-3D, for evaluating the task of long-range Tracking Any Point in 3D (TAP-3D). While point tracking in two dimensions (TAP) has many benchmarks measuring performance on real-world videos, such as TAPVid-DAVIS, three-dimensional point tracking has none. To this end, leveraging existing footage, we build a new benchmark for 3D point tracking featuring 4,000+ real-world videos, composed of three different data sources spanning a variety of object types, motion patterns, and indoor and outdoor environments. To measure performance on the TAP-3D task, we formulate a collection of metrics that extend the Jaccard-based metric used in TAP to handle the complexities of ambiguous depth scales across models, occlusions, and multi-track spatio-temporal smoothness. We manually verify a large sample of trajectories to ensure correct video annotations, and assess the…
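The abstract describes metrics that extend the Jaccard-based TAP metric to cope with ambiguous depth scale and occlusion. The sketch below illustrates that general idea only. The function names, the ratio-of-medians scale alignment, and the single fixed distance threshold are all illustrative assumptions and should not be read as the paper's definitions.

```python
import math
from statistics import median


def median_depth_scale(pred, gt, gt_visible):
    """Assumed alignment: ratio of median depths (z) over points visible in the ground truth.

    Assumes positive depths; returns 1.0 when no ground-truth point is visible.
    """
    pred_z = [p[2] for p, v in zip(pred, gt_visible) if v]
    gt_z = [g[2] for g, v in zip(gt, gt_visible) if v]
    if not gt_z:
        return 1.0
    return median(gt_z) / median(pred_z)


def jaccard_3d(pred, pred_visible, gt, gt_visible, threshold):
    """Jaccard-style score TP / (TP + FP + FN) over a flat list of 3D points.

    pred, gt: sequences of (x, y, z) tuples, one per (track, frame) pair.
    pred_visible, gt_visible: matching sequences of booleans.
    threshold: hypothetical fixed distance bound, in ground-truth units.
    """
    scale = median_depth_scale(pred, gt, gt_visible)
    tp = fp = fn = 0
    for p, pv, g, gv in zip(pred, pred_visible, gt, gt_visible):
        # Rescale the prediction before measuring Euclidean distance.
        close = math.dist([c * scale for c in p], g) <= threshold
        if pv and gv and close:
            tp += 1
        else:
            # A visible prediction that misses (or has no visible target) is a false positive.
            if pv:
                fp += 1
            # A visible target that is missed or predicted occluded is a false negative.
            if gv:
                fn += 1
    total = tp + fp + fn
    # The score is undefined when nothing is visible; this sketch returns 0.0 in that case.
    return tp / total if total else 0.0
```

Note that a point visible in both prediction and ground truth but farther than the threshold counts as both a false positive and a false negative, which mirrors how the 2D Jaccard metric in TAP penalizes inaccurate visible predictions.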

Code & Models

Videos

TAPVid-3D: A Benchmark for Tracking Any Point in 3D · SlidesLive