ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking
Xudong Han, Nobuyuki Oishi, Yueying Tian, Elif Ucurum, Rupert Young, Chris Chatwin, Philip Birch

TL;DR
ETTrack introduces an enhanced motion prediction method combining transformers and TCNs, along with a novel loss function, to improve multi-object tracking in complex motion scenarios, achieving state-of-the-art results.
Contribution
The paper presents a novel motion predictor integrating transformers and TCNs, and a Momentum Correction Loss, for better handling of non-linear object movements in MOT.
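As background on the TCN half of the predictor, the sketch below shows a causal dilated 1D convolution, the standard building block of Temporal Convolutional Networks. It is a minimal illustration under stated assumptions, not the paper's architecture: the function name, the kernel weights, and the velocity values are all hypothetical, and how ETTrack configures its TCN is not stated in this summary.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal 1D convolution over a list of floats.

    The output at step t depends only on x[t], x[t - dilation],
    x[t - 2 * dilation], and so on. Positions before the start of the
    sequence are treated as zero (implicit left padding).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that fall before the sequence start
                acc += w * x[idx]
        out.append(acc)
    return out


# Hypothetical per-frame horizontal velocities of one tracked box.
velocities = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Same two-tap kernel, two dilations: a larger dilation looks further back.
short_range = causal_dilated_conv1d(velocities, [0.5, 0.5], dilation=1)
long_range = causal_dilated_conv1d(velocities, [0.5, 0.5], dilation=2)
```

Stacking such layers with growing dilation widens the span of history each output can see without using any future frames, which is why this kind of layer suits frame-by-frame motion prediction.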
Findings
Achieves 56.4% HOTA on DanceTrack
Achieves 74.4% HOTA on SportsMOT
Outperforms several state-of-the-art trackers
Abstract
Many Multi-Object Tracking (MOT) approaches exploit motion information to associate all the detected objects across frames. However, many methods that rely on filtering-based algorithms, such as the Kalman Filter, often work well in linear motion scenarios but struggle to accurately predict the locations of objects undergoing complex and non-linear movements. To tackle these scenarios, we propose a motion-based MOT approach with an enhanced temporal motion predictor, ETTrack. Specifically, the motion predictor integrates a transformer model and a Temporal Convolutional Network (TCN) to capture short-term and long-term motion patterns, and it predicts the future motion of individual objects based on the historical motion information. Additionally, we propose a novel Momentum Correction Loss function that provides additional information regarding the motion direction of objects during…
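The abstract's point about linear versus non-linear motion can be illustrated with a toy comparison. The sketch below is not a Kalman Filter and not the paper's method: it uses plain constant-velocity extrapolation as a stand-in for a linear motion model, and the two trajectories, the function names, and the 30-degree step are hypothetical values chosen for illustration.

```python
import math


def constant_velocity_predict(history):
    """Extrapolate the next (x, y) from the last two observed positions."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def l2_error(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical trajectories: a straight line and a circle of radius 10.
linear = [(float(t), 2.0 * t) for t in range(6)]
step = math.pi / 6  # 30 degrees of rotation per frame
circular = [(10 * math.cos(step * t), 10 * math.sin(step * t)) for t in range(6)]

# Predict the final position from the earlier ones and compare to the truth.
linear_error = l2_error(constant_velocity_predict(linear[:-1]), linear[-1])
circular_error = l2_error(constant_velocity_predict(circular[:-1]), circular[-1])
```

On the straight line the extrapolation is exact, while on the circle it overshoots the true position by more than two units in a single frame. Errors of this kind, compounding over frames, are what a learned predictor using longer motion history is meant to reduce.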
Taxonomy
Topics: Video Surveillance and Tracking Methods · Fire Detection and Safety Systems
