CoWTracker: Tracking by Warping instead of Correlation
Zihang Lai, Eldar Insafutdinov, Edgar Sucar, Andrea Vedaldi

TL;DR
CoWTracker is a warping-based dense point tracker that avoids expensive cost volume computation. It achieves state-of-the-art results in dense point tracking and optical flow by combining iterative warping with transformer-based reasoning.
Contribution
The paper presents CoWTracker, a novel approach that replaces cost volume matching with iterative warping and transformer reasoning, unifying dense point tracking and optical flow estimation.
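The warping step at the core of this idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single-channel feature map stored as nested lists, bilinear interpolation, and border clamping, and the names `bilinear_sample` and `warp` are hypothetical. For each query pixel, it samples the target frame's features at the position given by the current displacement estimate, so the result is aligned with the query frame.

```python
import math

def bilinear_sample(feat, x, y):
    """Sample a 2D feature map at a continuous (x, y), clamped to the border."""
    height, width = len(feat), len(feat[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * feat[y0][x0] + wx * feat[y0][x1]
    bottom = (1 - wx) * feat[y1][x0] + wx * feat[y1][x1]
    return (1 - wy) * top + wy * bottom

def warp(target_feat, flow):
    """Warp target-frame features into the query frame.

    flow[y][x] = (dx, dy) is the current displacement estimate for the
    query pixel (x, y); the output at (x, y) is the target feature
    sampled at (x + dx, y + dy).
    """
    height, width = len(target_feat), len(target_feat[0])
    return [
        [
            bilinear_sample(target_feat, x + flow[y][x][0], y + flow[y][x][1])
            for x in range(width)
        ]
        for y in range(height)
    ]

# Toy example: a horizontal ramp and a uniform estimate of one pixel to the right.
target = [[float(x) for x in range(5)] for _ in range(4)]
flow = [[(1.0, 0.0) for _ in range(5)] for _ in range(4)]
warped = warp(target, flow)
# Row 0 of warped is [1.0, 2.0, 3.0, 4.0, 4.0]; the last value is border-clamped.
```

In an iterative scheme of the kind described here, a network would compare the warped features with the query features, predict an update to `flow`, and repeat. That update network is omitted from the sketch.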
Findings
Achieves state-of-the-art performance on dense point tracking benchmarks.
Outperforms specialized optical flow methods on multiple datasets.
Simplifies dense tracking and optical flow tasks with a unified warping-based architecture.
Abstract
Dense point tracking is a fundamental problem in computer vision, with applications ranging from video analysis to robotic manipulation. State-of-the-art trackers typically rely on cost volumes to match features across frames, but this approach incurs quadratic complexity in spatial resolution, limiting scalability and efficiency. In this paper, we propose CoWTracker, a novel dense point tracker that eschews cost volumes in favor of warping. Inspired by recent advances in optical flow, our approach iteratively refines track estimates by warping features from the target frame to the query frame based on the current estimate. Combined with a transformer architecture that performs joint spatiotemporal reasoning across all tracks, our design establishes long-range correspondences without computing feature correlations. Our model is simple and achieves state-of-the-art performance on standard…
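To make the quadratic cost concrete, the following back-of-the-envelope sketch counts entries for one frame pair. It assumes an all-pairs cost volume (every query pixel correlated with every target pixel) and a hypothetical channel count of 128 for the warped feature map; neither figure is taken from the paper, and the function names are illustrative.

```python
def all_pairs_cost_volume_entries(height, width):
    """Entries in an all-pairs cost volume: one score per (query, target) pixel pair."""
    num_pixels = height * width
    return num_pixels * num_pixels

def warped_feature_entries(height, width, channels):
    """Entries in a single warped feature map: one feature vector per query pixel."""
    return height * width * channels

for size in (64, 128):
    cost = all_pairs_cost_volume_entries(size, size)
    warped = warped_feature_entries(size, size, channels=128)  # 128 is an assumption
    print(f"{size}x{size}: cost volume {cost:,} entries, warped features {warped:,} entries")
```

At 64x64 the cost volume holds 16,777,216 entries against 524,288 for the warped features. Doubling the resolution to 128x128 multiplies the cost volume by 16 (268,435,456 entries) but the warped features by only 4 (2,097,152 entries).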
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods
