Transforming Model Prediction for Tracking
Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc Van Gool

TL;DR
This paper introduces a Transformer-based model prediction module for tracking that improves global reasoning and model expressivity, achieving state-of-the-art performance on multiple benchmarks.
Contribution
It proposes a Transformer-based tracker architecture, trained end-to-end, that learns more powerful target models and improves bounding box regression.
Findings
Achieves 68.5% AUC on the LaSOT dataset.
Sets new state-of-the-art results on three tracking benchmarks.
Demonstrates improved global reasoning in tracking models.
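For readers unfamiliar with the metric, the AUC figure reported on tracking benchmarks such as LaSOT is usually the area under the success plot: the fraction of frames whose predicted box overlaps the ground truth by more than a threshold, averaged over thresholds from 0 to 1. The sketch below is a minimal illustration under that assumption. The function names `iou` and `success_auc`, the `(x, y, w, h)` box format, the 21 evenly spaced thresholds, and the strict `>` comparison are illustrative choices and may differ from the exact evaluation protocol used in the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    # Width and height of the overlap region, clamped at zero.
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred, gt, num_thresholds=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1].

    Assumes pred and gt are non-empty, equally long lists of boxes.
    """
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    # Success rate at a threshold: share of frames with IoU above it.
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    predicted = [(0, 0, 2, 2), (1, 0, 2, 2)]
    ground_truth = [(0, 0, 2, 2), (0, 0, 2, 2)]
    print(success_auc(predicted, ground_truth))
```

Because the comparison is strict, a perfect tracker scores slightly below 1 under this convention, since no overlap exceeds the final threshold of 1.0.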
Abstract
Optimization-based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function. While this inductive bias integrates valuable domain knowledge, it limits the expressivity of the tracking network. In this work, we therefore propose a tracker architecture employing a Transformer-based model prediction module. Transformers capture global relations with little inductive bias, allowing them to learn the prediction of more powerful target models. We further extend the model predictor to estimate a second set of weights that are applied for accurate bounding box regression. The resulting tracker relies on training and test frame information in order to predict all weights transductively. We train the proposed tracker end-to-end and validate its performance by conducting comprehensive…
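The idea of predicting target model weights from both training and test frame features can be illustrated with a toy example. The sketch below is not the paper's architecture: a single scaled dot-product attention step stands in for the full Transformer, there are no learned projections, and the names `predict_weights` and `score_map` are hypothetical. It only shows the data flow the abstract describes: a query attends over tokens from both frames to produce a weight vector, which is then correlated with the test frame features to score each location.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def predict_weights(query, train_tokens, test_tokens):
    """One attention step over tokens from BOTH frames (transductive).

    Returns a weight vector with the same dimension as the tokens.
    """
    tokens = train_tokens + test_tokens
    scale = math.sqrt(len(query))
    attn = softmax([dot(query, t) / scale for t in tokens])
    dim = len(query)
    # Attention-weighted sum of the token features.
    return [sum(a * t[d] for a, t in zip(attn, tokens)) for d in range(dim)]


def score_map(weights, test_tokens):
    """Apply the predicted weights to every test location."""
    return [dot(weights, t) for t in test_tokens]


if __name__ == "__main__":
    train = [[1.0, 0.0], [0.0, 1.0]]   # toy features: target, background
    test = [[1.0, 0.0], [0.0, 1.0]]    # same two kinds in the test frame
    query = [4.0, 0.0]                 # hypothetical target-like query
    w = predict_weights(query, train, test)
    print(score_map(w, test))          # the target-like location scores higher
```

In the abstract's terms, a second weight vector predicted the same way would feed the bounding box regression branch; that part is omitted here.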
Taxonomy
Topics: Video Surveillance and Tracking Methods · Air Quality Monitoring and Forecasting · Human Mobility and Location-Based Analysis
