Improving Accuracy and Generalization for Efficient Visual Tracking
Ram Zaveri, Shivang Patel, Yu Gu, Gianfranco Doretto

TL;DR
This paper introduces SiamABC, an efficient Siamese visual tracker that improves accuracy and generalization, especially on out-of-distribution sequences, through a novel architecture, new training losses, and test-time adaptation, achieving state-of-the-art results.
Contribution
SiamABC combines new architectural designs, training losses, and a fast test-time adaptation method to improve out-of-distribution generalization in visual tracking.
Findings
SiamABC outperforms MixFormerV2-S by 7.6% on the OOD AVisT benchmark.
SiamABC maintains high accuracy on in-distribution benchmarks.
SiamABC runs at 100 FPS on CPU, three times faster than comparable methods.
Abstract
Efficient visual trackers overfit to their training distributions and generalize poorly: they perform well on their respective in-distribution (ID) test sets but worse on out-of-distribution (OOD) sequences, which limits their deployment in the wild under constrained resources. We introduce SiamABC, a highly efficient Siamese tracker that significantly improves tracking performance, even on OOD sequences. SiamABC takes advantage of new architectural designs in the way it bridges the dynamic variability of the target, and of new losses for training. It also directly addresses OOD tracking generalization by including a fast, backward-free dynamic test-time adaptation method that continuously adapts the model according to the dynamic visual changes of the target. Our extensive experiments suggest that SiamABC shows remarkable performance gains…
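The abstract does not say how the backward-free adaptation works. As a rough illustration of the general idea only, the sketch below shows one common family of gradient-free test-time adaptation: blending a normalization layer's stored statistics with statistics computed from incoming test features. The class name `AdaptiveNorm`, the `momentum` parameter, and the update rule are all assumptions made for this example, not the method used in SiamABC.

```python
import numpy as np


class AdaptiveNorm:
    """Hypothetical normalization layer whose statistics adapt at test time.

    Illustrative only: this is NOT SiamABC's adaptation method, just a
    generic example of adapting without any backward pass.
    """

    def __init__(self, mean, var, momentum=0.1, eps=1e-5):
        self.mean = np.asarray(mean, dtype=np.float64)  # stats learned in training
        self.var = np.asarray(var, dtype=np.float64)
        self.momentum = momentum  # assumed blending weight for new statistics
        self.eps = eps

    def adapt(self, features):
        """Blend stored statistics with those of the current test features.

        features: array of shape (num_samples, num_channels).
        No gradients are computed, so the update is a cheap forward-only step.
        """
        batch_mean = features.mean(axis=0)
        batch_var = features.var(axis=0)
        m = self.momentum
        self.mean = (1.0 - m) * self.mean + m * batch_mean
        self.var = (1.0 - m) * self.var + m * batch_var

    def __call__(self, features):
        """Normalize features with the current (possibly adapted) statistics."""
        return (features - self.mean) / np.sqrt(self.var + self.eps)


# Usage: adapt on the features of each new frame, then normalize them.
norm = AdaptiveNorm(mean=[0.0, 0.0], var=[1.0, 1.0], momentum=0.5)
frame_features = np.array([[2.0, 4.0], [4.0, 8.0]])
norm.adapt(frame_features)
normalized = norm(frame_features)
```

Because the update uses only forward statistics, its per-frame cost is small, which fits the constrained-resource setting the abstract targets. Whether SiamABC uses anything resembling this rule is not stated in the text above.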
Taxonomy
Topics: Video Surveillance and Tracking Methods
