Synthetic data generation for end-to-end thermal infrared tracking
Lichao Zhang, Abel Gonzalez-Garcia, Joost van de Weijer, Martin Danelljan, and Fahad Shahbaz Khan

TL;DR
This paper introduces a method to generate synthetic thermal infrared data from RGB images using image translation models, enabling training of end-to-end TIR trackers and significantly improving tracking performance.
Contribution
To the authors' knowledge, this is the first work to train end-to-end features for TIR tracking. It does so using synthetic data generated by image-to-image translation models, which improves tracking accuracy.
Findings
Synthetic TIR data improves tracker performance.
Combining synthetic and real TIR data yields better results.
The approach achieves a relative gain of over 10% compared with state-of-the-art methods.
Abstract
The usage of both off-the-shelf and end-to-end trained deep networks has significantly improved the performance of visual tracking on RGB videos. However, the lack of large labeled datasets hampers the usage of convolutional neural networks for tracking in thermal infrared (TIR) images. Therefore, most state-of-the-art methods for tracking on TIR data are still based on handcrafted features. To address this problem, we propose to use image-to-image translation models. These models allow us to translate the abundantly available labeled RGB data to synthetic TIR data. We explore the usage of both paired and unpaired image translation models for this purpose. These methods provide us with a large labeled dataset of synthetic TIR sequences, on which we can train end-to-end optimal features for tracking. To the best of our knowledge, we are the first to train end-to-end features for TIR…
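The data-generation idea in the abstract can be illustrated with a minimal sketch: each RGB frame of a labeled tracking sequence is passed through a translation model, and the bounding-box annotations are carried over unchanged because the translation keeps the image geometry. Everything named here is an assumption made for illustration, not the authors' code: `LabeledSequence`, `translate_sequence`, and `placeholder_translator` are hypothetical, and the placeholder only computes luminance so that the example runs. In the paper this role is played by a trained paired or unpaired image-to-image translation model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

RgbFrame = List[List[Tuple[int, int, int]]]  # rows of (r, g, b) pixels
TirFrame = List[List[int]]                   # rows of single-channel intensities
Box = Tuple[int, int, int, int]              # (x, y, width, height)


@dataclass
class LabeledSequence:
    """A tracking sequence: one bounding box per frame (hypothetical container)."""
    frames: list
    boxes: List[Box]


def placeholder_translator(frame: RgbFrame) -> TirFrame:
    """Stand-in for a trained RGB-to-TIR generator (assumption).

    It only converts RGB to luminance so the sketch is runnable;
    it does not model thermal appearance.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in frame
    ]


def translate_sequence(
    seq: LabeledSequence,
    translator: Callable[[RgbFrame], TirFrame],
) -> LabeledSequence:
    """Translate every frame; reuse the RGB labels for the synthetic frames."""
    synthetic_frames = [translator(frame) for frame in seq.frames]
    return LabeledSequence(frames=synthetic_frames, boxes=list(seq.boxes))


# Usage: a toy two-frame, 2x2-pixel RGB sequence with one box per frame.
rgb_seq = LabeledSequence(
    frames=[
        [[(0, 0, 0), (255, 0, 0)], [(0, 255, 0), (0, 0, 255)]],
        [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]],
    ],
    boxes=[(0, 0, 1, 1), (1, 0, 1, 1)],
)
tir_seq = translate_sequence(rgb_seq, placeholder_translator)
```

The resulting `tir_seq` has the same number of frames and the same boxes as `rgb_seq`, which is what lets existing RGB tracking annotations serve as labels for the synthetic TIR data.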
