Learning a Neural Association Network for Self-supervised Multi-Object Tracking
Shuai Li, Michael Burke, Subramanian Ramamoorthy, Juergen Gall

TL;DR
This paper presents a self-supervised framework for multi-object tracking that combines a neural Kalman filter with an expectation maximization (EM) algorithm to learn data association without identity annotations, achieving state-of-the-art results among self-supervised trackers.
Contribution
Introduces a fully differentiable, self-supervised learning framework for multi-object tracking using a neural Kalman filter and EM algorithm, eliminating the need for identity annotations.
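To make the Kalman filter idea concrete, here is a minimal scalar sketch of one predict/update step in which the observation is a soft-weighted mix of candidate detections. This is an illustration only, not the paper's model: the random-walk motion model, the weighted-average observation, and the names kalman_step, process_var, and obs_var are all assumptions introduced here.

```python
def kalman_step(mean, var, detections, weights, process_var=1.0, obs_var=0.5):
    """One scalar Kalman filter step with a soft-associated observation.

    Assumptions (not from the paper): a random-walk motion model and an
    observation formed as the association-weighted average of detections.
    """
    # Predict: a random walk keeps the mean and inflates the variance.
    pred_mean = mean
    pred_var = var + process_var

    # Associate: blend candidate detections using soft association weights
    # (in a learned tracker these weights would come from a network).
    z = sum(w * d for w, d in zip(weights, detections))

    # Update: standard Kalman gain and posterior for a scalar state.
    gain = pred_var / (pred_var + obs_var)
    new_mean = pred_mean + gain * (z - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


# Two candidate detections; the first is strongly preferred.
mean, var = kalman_step(0.0, 1.0, detections=[1.0, 5.0], weights=[0.9, 0.1])
```

Because every operation above is differentiable in the weights, a loss on the filtered state could in principle be backpropagated to whatever produces them, which is the general property the contribution statement relies on.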
Findings
Achieves state-of-the-art results among self-supervised trackers on the MOT17, MOT20, and BDD100K datasets.
Learns data association without identity-level supervision.
Outperforms existing self-supervised trackers.
Abstract
This paper introduces a novel framework to learn data association for multi-object tracking in a self-supervised manner. Fully-supervised learning methods are known to achieve excellent tracking performance, but acquiring identity-level annotations is tedious and time-consuming. Motivated by the fact that in real-world scenarios object motion can usually be represented by a Markov process, we present a novel expectation maximization (EM) algorithm that trains a neural network to associate detections for tracking, without requiring prior knowledge of their temporal correspondences. At the core of our method lies a neural Kalman filter, with an observation model conditioned on associations of detections parameterized by a neural network. Given a batch of frames as input, data associations between detections from adjacent frames are predicted by a neural network followed by a Sinkhorn…
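The abstract mentions a Sinkhorn step applied to the network's association predictions. As a rough illustration of generic Sinkhorn normalization (not the paper's exact layer, whose details are cut off above), the sketch below alternately normalizes the rows and columns of an exponentiated score matrix until it is approximately doubly stochastic. The function name sinkhorn, the iteration count, and the example scores are assumptions made for this example.

```python
import math


def sinkhorn(scores, n_iters=100):
    """Turn a square score matrix into an approximately doubly stochastic one.

    Generic Sinkhorn normalization: exponentiate, then alternately
    normalize rows and columns. Assumes a square matrix of finite scores.
    """
    # Subtract the global maximum before exponentiating for numerical stability.
    top = max(max(row) for row in scores)
    p = [[math.exp(s - top) for s in row] for row in scores]

    for _ in range(n_iters):
        # Row normalization: each detection in frame t distributes mass 1.
        p = [[v / row_sum for v in row] for row, row_sum in
             ((row, sum(row)) for row in p)]
        # Column normalization: each detection in frame t+1 receives mass 1.
        col_sums = [sum(col) for col in zip(*p)]
        p = [[v / c for v, c in zip(row, col_sums)] for row in p]
    return p


# Hypothetical pairwise scores between 3 detections in two adjacent frames.
scores = [[2.0, 0.1, 0.3],
          [0.2, 1.5, 0.1],
          [0.1, 0.3, 1.8]]
assignment = sinkhorn(scores)
```

Each entry of assignment can be read as a soft probability that a detection in one frame matches a detection in the next. Real trackers also have to handle unequal detection counts and unmatched objects, which this square-matrix sketch leaves out.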
Taxonomy
Topics: Video Surveillance and Tracking Methods
