AnthroTAP: Learning Point Tracking with Real-World Motion

Inès Hyeonsu Kim; Seokju Cho; Jahyeok Koo; Junghyun Park; Jiahui Huang; Honglak Lee; Joon-Young Lee; Seungryong Kim

arXiv:2507.06233·cs.CV·March 31, 2026

TL;DR

AnthroTAP introduces an automated pipeline that generates large-scale pseudo-labeled point tracking data from real human videos, improving real-world generalization of tracking models.

Contribution

It presents a novel method to produce real-world training data for point tracking by fitting SMPL models to human videos, reducing reliance on expensive manual annotations.

Findings

01. A model trained on AnthroTAP surpasses the state of the art on TAP-Vid.

02. It outperforms recent self-training methods while requiring less training time.

03. Structured human motion is an effective supervision source.

Abstract

Point tracking models often struggle to generalize to real-world videos because large-scale training data is predominantly synthetic, the only source currently feasible to produce at scale. Collecting real-world annotations, however, is prohibitively expensive, as it requires tracking hundreds of points across frames. We introduce AnthroTAP, an automated pipeline that generates large-scale pseudo-labeled point tracking data from real human motion videos. Leveraging the structured complexity of human movement (non-rigid deformations, articulated motion, and frequent occlusions), AnthroTAP fits Skinned Multi-Person Linear (SMPL) models to detected humans, projects mesh vertices onto image planes, resolves occlusions via ray-casting, and filters unreliable tracks using optical flow consistency. A model trained on the AnthroTAP dataset…
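Two of the steps named in the abstract, projecting mesh vertices onto the image plane and filtering tracks by optical flow consistency, can be illustrated with a minimal sketch. This is not the authors' implementation: the pinhole camera model, the function names (`project_pinhole`, `flow_consistent`, `filter_track`), and the `max_err` pixel threshold are all assumptions made for illustration, and the SMPL fitting and ray-casting occlusion steps are omitted entirely.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def project_pinhole(vertex: Point3, focal: float, cx: float, cy: float) -> Point2:
    """Project a camera-space 3D point onto the image plane.

    Assumes a simple pinhole camera; the abstract does not state
    which camera model the pipeline uses.
    """
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex must lie in front of the camera (z > 0)")
    return (focal * x / z + cx, focal * y / z + cy)


def flow_consistent(p_t: Point2, p_next: Point2, flow: Point2, max_err: float) -> bool:
    """Check whether the projected displacement agrees with the optical flow.

    `flow` is the flow vector sampled at p_t; `max_err` is a hypothetical
    pixel tolerance, not a value taken from the paper.
    """
    dx = (p_next[0] - p_t[0]) - flow[0]
    dy = (p_next[1] - p_t[1]) - flow[1]
    return (dx * dx + dy * dy) ** 0.5 <= max_err


def filter_track(
    track3d: List[Point3],
    flows: List[Point2],
    focal: float,
    cx: float,
    cy: float,
    max_err: float = 2.0,
) -> Tuple[List[Point2], List[bool]]:
    """Project one vertex trajectory and flag each frame transition.

    Returns the 2D track and one boolean per consecutive frame pair,
    True where the projected motion matches the supplied flow.
    """
    if len(flows) != len(track3d) - 1:
        raise ValueError("need exactly one flow vector per frame transition")
    pts = [project_pinhole(v, focal, cx, cy) for v in track3d]
    keep = [
        flow_consistent(pts[i], pts[i + 1], flows[i], max_err)
        for i in range(len(pts) - 1)
    ]
    return pts, keep


# Usage: a vertex moving right at constant depth; the second flow vector
# disagrees with the projected motion, so that transition is flagged.
track = [(0.0, 0.0, 2.0), (0.5, 0.0, 2.0), (1.0, 0.0, 2.0)]
flows = [(25.0, 0.0), (0.0, 0.0)]
points, reliable = filter_track(track, flows, focal=100.0, cx=50.0, cy=50.0)
```

In a full pipeline the flow vectors would come from an optical flow estimator and the 3D trajectory from the fitted mesh; here both are supplied by hand so the sketch stays self-contained.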
