Accurate Planar Tracking With Robust Re-Detection
Jonas Serych, Jiri Matas

TL;DR
This paper introduces SAM-H and WOFTSAM, planar trackers that combine SAM 2 segmentation tracking with 8-degrees-of-freedom homography estimation. They set a new state of the art on the POT-210 and PlanarTrack benchmarks and add re-detection of lost targets.
Contribution
The paper proposes two planar trackers: SAM-H, which estimates homographies from SAM 2 segmentation mask contours and is therefore robust to target appearance changes, and WOFTSAM, which improves the WOFT tracker by using SAM-H to re-detect lost targets.
Findings
Achieved new state-of-the-art performance on the POT-210 and PlanarTrack benchmarks.
On PlanarTrack, outperformed the second-best method by 12.4 and 15.2 percentage points on the p@15 metric.
Provided improved ground-truth annotations of the initial PlanarTrack poses, enabling more accurate benchmarking on the high-precision p@5 metric.
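To make the p@15 and p@5 figures concrete, here is a minimal sketch of how a precision-at-threshold score is commonly computed for planar trackers. It assumes the definition usual in planar tracking benchmarks: the per-frame alignment error is the root-mean-square distance in pixels between the four predicted and four ground-truth corner points, and p@t is the fraction of frames with error at most t. The function names and the inclusive threshold are assumptions for illustration, not taken from the paper or its code.

```python
import math


def alignment_error(pred_corners, gt_corners):
    """RMS distance (pixels) between predicted and ground-truth corners.

    Both arguments are sequences of (x, y) pairs in the same corner order.
    """
    squared = [
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(pred_corners, gt_corners)
    ]
    return math.sqrt(sum(squared) / len(squared))


def precision_at(errors, threshold):
    """Fraction of frames whose alignment error is <= threshold (assumed inclusive)."""
    if not errors:
        return 0.0
    return sum(1 for e in errors if e <= threshold) / len(errors)


# Hypothetical per-frame errors for a four-frame sequence.
frame_errors = [2.0, 4.0, 10.0, 20.0]
p_at_5 = precision_at(frame_errors, 5)
p_at_15 = precision_at(frame_errors, 15)
```

Under this definition, p@5 rewards only near-exact alignment, which is why errors in the ground-truth initial poses matter more there than at p@15.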
Abstract
We present SAM-H and WOFTSAM, novel planar trackers that combine robust long-term segmentation tracking provided by SAM 2 with 8 degrees-of-freedom homography pose estimation. SAM-H estimates homographies from segmentation mask contours and is thus highly robust to target appearance changes. WOFTSAM significantly improves the current state-of-the-art planar tracker WOFT by exploiting lost target re-detection provided by SAM-H. The proposed methods are evaluated on POT-210 and PlanarTrack tracking benchmarks, setting the new state-of-the-art performance on both. On the latter, they outperform the second best by a large margin, +12.4 and +15.2pp on the p@15 metric. We also present improved ground-truth annotations of initial PlanarTrack poses, enabling more accurate benchmarking in the high-precision p@5 metric. The code and the re-annotations are available at…
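The 8 degrees of freedom mentioned above are the eight free parameters of a 3x3 homography matrix defined up to scale. As an illustration only, the sketch below recovers such a homography from four point correspondences by solving the standard 8x8 linear system with the bottom-right entry fixed to 1. It is not the SAM-H procedure: how the authors derive correspondences from mask contours is not described in this summary, and the function names and the idea of feeding in four corner points are assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(src, dst):
    """Return a 3x3 homography H (H[2][2] = 1) mapping each src point to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, eight unknowns in total.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, point):
    """Map a 2D point through H, including the projective division."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


# Hypothetical example: template corners and their positions in a later frame.
template = [(0, 0), (1, 0), (1, 1), (0, 1)]
observed = [(10, 20), (110, 25), (120, 130), (5, 120)]
H = homography_from_4_points(template, observed)
```

Fixing the last entry to 1 is a simplification that fails when that entry is truly zero; a practical tracker would more likely use a normalized least-squares or robust estimator over many points.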
Taxonomy
Topics: Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
