Is This Tracker On? A Benchmark Protocol for Dynamic Tracking
Ilona Demler, Saumya Chauhan, Georgia Gkioxari

TL;DR
This paper introduces ITTO, a comprehensive benchmark suite for evaluating point tracking methods in complex, real-world scenarios, revealing current limitations and guiding future improvements.
Contribution
The paper presents ITTO, a new benchmark with diverse, challenging videos and detailed annotations, enabling rigorous assessment of tracking algorithms under realistic conditions.
Findings
Existing trackers struggle with occlusion re-identification.
Performance drops significantly with motion complexity.
ITTO highlights key failure modes in current tracking methods.
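One way to make the occlusion finding concrete is to score a track separately on the frames where a point reappears after being occluded. The sketch below is illustrative only and is not ITTO's evaluation protocol: the function names, the per-frame visibility flags, and the 8-pixel threshold are all assumptions.

```python
import numpy as np


def reid_mask(visible):
    """Mark visible frames that follow an occlusion of an already-seen point.

    visible: sequence of T booleans (ground-truth visibility per frame).
    Hypothetical helper, not part of any ITTO tooling.
    """
    visible = np.asarray(visible, dtype=bool)
    mask = np.zeros_like(visible)
    seen = False                 # point has been visible at least once
    occluded_after_seen = False  # point was hidden after being seen
    for t, v in enumerate(visible):
        if v:
            mask[t] = occluded_after_seen
            seen = True
        elif seen:
            occluded_after_seen = True
    return mask


def accuracy_within(pred, gt, mask, threshold_px=8.0):
    """Fraction of masked frames whose position error is <= threshold_px.

    pred, gt: arrays of shape (T, 2) with (x, y) positions in pixels.
    threshold_px is an assumed value chosen for illustration.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return float("nan")  # no frames to score in this subset
    err = np.linalg.norm(pred - gt, axis=-1)
    return float((err[mask] <= threshold_px).mean())


# Toy track: visible, visible, occluded, occluded, visible, visible.
visible = np.array([True, True, False, False, True, True])
gt = np.zeros((6, 2))
pred = np.array([[1, 0], [1, 0], [0, 0], [0, 0], [20, 0], [2, 0]], dtype=float)

overall_acc = accuracy_within(pred, gt, visible)
reid_acc = accuracy_within(pred, gt, reid_mask(visible))
print(overall_acc, reid_acc)
```

On this toy track the tracker looks acceptable when averaged over all visible frames but noticeably worse on the post-occlusion subset, which is the kind of gap a per-axis breakdown is meant to expose.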
Abstract
We introduce ITTO, a challenging new benchmark suite for evaluating and diagnosing the capabilities and limitations of point tracking methods. Our videos are sourced from existing datasets and egocentric real-world recordings, with high-quality human annotations collected through a multi-stage pipeline. ITTO captures the motion complexity, occlusion patterns, and object diversity characteristic of real-world scenes -- factors that are largely absent in current benchmarks. We conduct a rigorous analysis of state-of-the-art tracking methods on ITTO, breaking down performance along key axes of motion complexity. Our findings reveal that existing trackers struggle with these challenges, particularly in re-identifying points after occlusion, highlighting critical failure modes. These results point to the need for new modeling approaches tailored to real-world dynamics. We envision ITTO as a…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Social Robot Interaction and HRI
