Architecture and evaluation protocol for transformer-based visual object tracking in UAV applications
Augustin Borne (ISL, Hochschule Karlsruhe -- Technik und Wirtschaft Karlsruhe University of Applied Sciences, IRIMAS), Pierre Notin (ISL), Christophe Hennequin (ISL), Sebastien Changey (ISL), Stephane Bazeille (IRIMAS), Christophe Cudel (IRIMAS), Franz Quint

TL;DR
This paper introduces MATA, a modular transformer-based tracking architecture for UAVs, together with an embedded-oriented evaluation protocol and the NT2F metric. It reports improved robustness and real-time suitability on embedded hardware.
Contribution
The paper presents a modular architecture that combines a transformer-based tracker with an Extended Kalman Filter and ego-motion compensation, along with a hardware-independent evaluation protocol and the NT2F metric.
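To illustrate how a motion filter and ego-motion compensation can work together, here is a minimal sketch. It is not the authors' implementation. It swaps the Extended Kalman Filter for a plain linear constant-velocity Kalman filter applied per image axis, and it assumes the camera-induced image shift can be approximated by the per-axis median of sparse optical-flow vectors. The names `AxisKalman` and `ego_shift_from_flow` and the noise values `q` and `r` are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


def ego_shift_from_flow(flow_vectors):
    """Approximate the global image translation as the per-axis median
    of sparse flow vectors (dx, dy). Assumption: most tracked features
    lie on the static background, so the median is robust to outliers."""
    dx = median(v[0] for v in flow_vectors)
    dy = median(v[1] for v in flow_vectors)
    return dx, dy


@dataclass
class AxisKalman:
    """Linear constant-velocity Kalman filter for one image axis (pixels)."""
    pos: float
    vel: float = 0.0
    # Entries of the symmetric 2x2 covariance [[p_pp, p_pv], [p_pv, p_vv]].
    p_pp: float = 100.0
    p_pv: float = 0.0
    p_vv: float = 100.0
    q: float = 1.0  # process noise added to each variance (assumed value)
    r: float = 4.0  # measurement noise variance (assumed value)

    def predict(self, dt: float = 1.0, ego_shift: float = 0.0) -> float:
        # Ego-motion compensation: shift the state by the camera-induced motion.
        self.pos += ego_shift
        # Constant-velocity motion model.
        self.pos += self.vel * dt
        # P = F P F^T + Q with F = [[1, dt], [0, 1]].
        p_pp = self.p_pp + dt * (2.0 * self.p_pv + dt * self.p_vv) + self.q
        p_pv = self.p_pv + dt * self.p_vv
        self.p_pp, self.p_pv = p_pp, p_pv
        self.p_vv += self.q
        return self.pos

    def update(self, measured_pos: float) -> float:
        # Measurement model H = [1, 0]: only the position is observed.
        s = self.p_pp + self.r
        k_pos = self.p_pp / s
        k_vel = self.p_pv / s
        residual = measured_pos - self.pos
        self.pos += k_pos * residual
        self.vel += k_vel * residual
        # P = (I - K H) P, written out entry by entry.
        p_pp = (1.0 - k_pos) * self.p_pp
        p_pv = (1.0 - k_pos) * self.p_pv
        p_vv = self.p_vv - k_vel * self.p_pv
        self.p_pp, self.p_pv, self.p_vv = p_pp, p_pv, p_vv
        return self.pos


# Usage: one filter per axis; the tracker output serves as the measurement.
flows = [(2.0, -1.0), (2.0, -1.0), (50.0, 30.0)]  # last vector is an outlier
shift_x, shift_y = ego_shift_from_flow(flows)
kx = AxisKalman(pos=100.0)
predicted_x = kx.predict(ego_shift=shift_x)
corrected_x = kx.update(110.0)
```

In this sketch the prediction can bridge frames where the tracker output is missing or unreliable, for example during an occlusion, by calling `predict` without a following `update`.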
Findings
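MATA shows consistent improvements in the Success and NT2F metrics over existing trackers on UAV benchmarks, including an augmented UAV123 dataset with synthetic occlusions.
The NT2F metric quantifies how long a tracker can sustain a tracking sequence without external help.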
Embedded implementation confirms real-time suitability.
Abstract
Object tracking from Unmanned Aerial Vehicles (UAVs) is challenged by platform dynamics, camera motion, and limited onboard resources. Existing visual trackers either lack robustness in complex scenarios or are too computationally demanding for real-time embedded use. We propose a Modular Asynchronous Tracking Architecture (MATA) that combines a transformer-based tracker with an Extended Kalman Filter, integrating ego-motion compensation from sparse optical flow and an object trajectory model. We further introduce a hardware-independent, embedded-oriented evaluation protocol and a new metric called Normalized Time to Failure (NT2F) to quantify how long a tracker can sustain a tracking sequence without external help. Experiments on UAV benchmarks, including an augmented UAV123 dataset with synthetic occlusions, show consistent improvements in Success and NT2F metrics across multiple…
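The abstract does not give the formula for NT2F, so the following is only one plausible reading, not the paper's definition. It assumes that a failure is the first frame where the overlap (IoU) between the predicted and ground-truth boxes drops to a threshold or below, and that NT2F is the number of frames tracked before that failure divided by the sequence length. The function names, the `(x, y, w, h)` box format, and the default threshold are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def normalized_time_to_failure(pred_boxes, gt_boxes, iou_threshold=0.0):
    """Fraction of the sequence tracked before the first failure.

    Assumed definition: a failure is the first frame whose IoU is at or
    below iou_threshold; a sequence with no failure scores 1.0.
    """
    if len(pred_boxes) != len(gt_boxes):
        raise ValueError("prediction and ground truth lengths differ")
    if not gt_boxes:
        raise ValueError("empty sequence")
    for frame, (pred, gt) in enumerate(zip(pred_boxes, gt_boxes)):
        if iou(pred, gt) <= iou_threshold:
            return frame / len(gt_boxes)
    return 1.0


# Usage: the tracker loses the target at the third of four frames.
gt = [(0, 0, 10, 10)] * 4
pred = [(0, 0, 10, 10), (1, 0, 10, 10), (50, 50, 10, 10), (0, 0, 10, 10)]
score = normalized_time_to_failure(pred, gt)  # 0.5
```

Under this reading the tracker is never re-initialized after a failure, which matches the idea of measuring how long it runs "without external help". A later recovery, as in the last frame of the example, does not raise the score.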
Taxonomy
Topics: Video Surveillance and Tracking Methods · UAV Applications and Optimization · Target Tracking and Data Fusion in Sensor Networks
