Local All-Pair Correspondence for Point Tracking
Seokju Cho, Jiahui Huang, Jisu Nam, Honggyu An, Seungryong Kim, Joon-Young Lee

TL;DR
LocoTrack is a point tracking model that combines local all-pair (4D) correlations, bidirectional matching, and a compact Transformer to track points across videos robustly, accurately, and quickly, outperforming existing methods on the TAP-Vid benchmarks.
Contribution
The paper introduces LocoTrack, which replaces local 2D correlation maps with local all-pair (4D) correlations and pairs them with a lightweight correlation encoder and a compact Transformer to improve both accuracy and efficiency in point tracking.
Findings
Achieves state-of-the-art accuracy on TAP-Vid benchmarks.
Operates nearly 6 times faster than previous methods.
Effectively handles ambiguous regions with bidirectional correspondence.
Abstract
We introduce LocoTrack, a highly accurate and efficient model designed for the task of tracking any point (TAP) across video sequences. Previous approaches in this task often rely on local 2D correlation maps to establish correspondences from a point in the query image to a local region in the target image, which often struggle with homogeneous regions or repetitive features, leading to matching ambiguities. LocoTrack overcomes this challenge with a novel approach that utilizes all-pair correspondences across regions, i.e., local 4D correlation, to establish precise correspondences, with bidirectional correspondence and matching smoothness significantly enhancing robustness against ambiguities. We also incorporate a lightweight correlation encoder to enhance computational efficiency, and a compact Transformer architecture to integrate long-term temporal information. LocoTrack achieves…
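The difference between a local 2D correlation map and the local 4D correlation described in the abstract can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names (`crop_patch`, `local_2d_correlation`, `local_4d_correlation`), the integer patch centers, the zero padding, and the random feature maps are all assumptions made for illustration. The sketch only shows the shape of the idea: a 2D map compares one query feature vector against a target neighborhood, while a 4D volume compares every position in a query neighborhood against every position in a target neighborhood.

```python
import numpy as np

def crop_patch(feat, center, radius):
    """Crop a (2r+1, 2r+1, C) patch around an integer (y, x) center.

    Borders are zero-padded (an assumption of this sketch).
    """
    y, x = center
    padded = np.pad(feat, ((radius, radius), (radius, radius), (0, 0)))
    # After padding, original row y sits at y + radius, so the window
    # [y, y + 2r] in padded coordinates is centered on the original point.
    return padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]

def local_2d_correlation(query_feat, target_feat, q_pt, t_pt, radius):
    """Correlate the single query feature with a target neighborhood."""
    q_vec = query_feat[q_pt[0], q_pt[1]]             # (C,)
    t_patch = crop_patch(target_feat, t_pt, radius)  # (K, K, C)
    return t_patch @ q_vec                           # (K, K)

def local_4d_correlation(query_feat, target_feat, q_pt, t_pt, radius):
    """Correlate every query-neighborhood position with every target one."""
    q_patch = crop_patch(query_feat, q_pt, radius)   # (K, K, C)
    t_patch = crop_patch(target_feat, t_pt, radius)  # (K, K, C)
    return np.einsum("ijc,klc->ijkl", q_patch, t_patch)  # (K, K, K, K)

# Hypothetical feature maps of shape (H, W, C), for illustration only.
rng = np.random.default_rng(0)
query_feat = rng.standard_normal((32, 32, 16))
target_feat = rng.standard_normal((32, 32, 16))

corr2d = local_2d_correlation(query_feat, target_feat, (10, 12), (11, 13), radius=3)
corr4d = local_4d_correlation(query_feat, target_feat, (10, 12), (11, 13), radius=3)
print(corr2d.shape, corr4d.shape)  # (7, 7) (7, 7, 7, 7)
```

In this sketch the center slice `corr4d[3, 3]` equals the 2D map, so the 4D volume contains the 2D case plus the correlations of the surrounding query positions. Slicing along the first two axes gives query-to-target maps, and slicing along the last two gives target-to-query maps, which is one way to read the bidirectional correspondence the abstract mentions.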
Taxonomy
Topics: Robotics and Sensor-Based Localization · Inertial Sensor and Navigation · Advanced Measurement and Metrology Techniques
Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention · Dense Connections
