DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction
Weiyi Lv, Yuhang Huang, Ning Zhang, Ruei-Sung Lin, Mei Han, and Dan Zeng

TL;DR
DiffMOT is a real-time diffusion-based multiple object tracker that models complex non-linear motion and outperforms existing methods on the DanceTrack and SportsMOT datasets.
Contribution
DiffMOT is the first to apply a diffusion probabilistic model to MOT, and it introduces a novel decoupled diffusion-based motion predictor for non-linear motion prediction.
Findings
Runs in real time at 22.7 FPS.
Outperforms state-of-the-art methods on the DanceTrack and SportsMOT datasets.
Achieves 62.3% HOTA on DanceTrack and 76.2% HOTA on SportsMOT.
Abstract
In Multiple Object Tracking, objects often exhibit non-linear motion, accelerating and decelerating with irregular direction changes. Tracking-by-detection (TBD) trackers with Kalman Filter motion prediction work well in pedestrian-dominant scenarios but fall short in complex situations where multiple objects exhibit non-linear and diverse motion simultaneously. To tackle complex non-linear motion, we propose a real-time diffusion-based MOT approach named DiffMOT. Specifically, for the motion predictor component, we propose a novel Decoupled Diffusion-based Motion Predictor (DMP). It models the entire distribution of the various motions present in the data as a whole. It also predicts an individual object's motion conditioned on that object's historical motion information. Furthermore, it optimizes the diffusion process to use far fewer sampling steps. As a MOT tracker, the…
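The limitation the abstract attributes to linear motion prediction can be illustrated with a small numerical sketch. This is not DiffMOT's code or a full Kalman Filter; it assumes a simplified 1-D constant-velocity extrapolation (the kind of linear motion model commonly paired with Kalman Filter trackers), and the function names and trajectory parameters are hypothetical, chosen only for illustration.

```python
def constant_velocity_predict(prev, curr):
    """Extrapolate the next position assuming the last displacement repeats.

    Simplified stand-in for a linear (constant-velocity) motion model;
    an assumption for illustration, not the predictor used by DiffMOT.
    """
    return curr + (curr - prev)


def position(t, x0=0.0, v0=1.0, a=0.0):
    """Ground-truth 1-D position at frame t under constant acceleration a."""
    return x0 + v0 * t + 0.5 * a * t * t


def prediction_errors(a, steps=5):
    """Absolute one-step prediction error for each of `steps` frames."""
    errors = []
    for t in range(2, steps + 2):
        pred = constant_velocity_predict(position(t - 2, a=a),
                                         position(t - 1, a=a))
        errors.append(abs(position(t, a=a) - pred))
    return errors


# Linear motion (a = 0): the constant-velocity assumption holds exactly.
linear_errors = prediction_errors(a=0.0)

# Accelerating motion (a = 2): every one-step prediction misses the target.
accel_errors = prediction_errors(a=2.0)

print("linear motion errors:     ", linear_errors)
print("accelerating motion errors:", accel_errors)
```

In this toy setting the one-step error equals the second difference of the trajectory, so it is zero for linear motion and equal to the acceleration otherwise. A learned, history-conditioned predictor such as the DMP described above is meant to cover the cases where this assumption breaks down.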
Taxonomy
Topics: Video Surveillance and Tracking Methods · Infrared Target Detection Methodologies
Methods: Diffusion
