Next Generation Multitarget Trackers: Random Finite Set Methods vs Transformer-based Deep Learning
Juliano Pinto, Georg Hess, William Ljungbergh, Yuxuan Xia, Lennart Svensson, Henk Wymeersch

TL;DR
This paper compares a novel Transformer-based deep learning tracker with state-of-the-art Bayesian filters for multitarget tracking. The learned tracker outperforms the Bayesian filters in complex scenarios and matches them in simpler ones.
Contribution
Introduces a Transformer-based deep learning method for multitarget tracking and compares it with state-of-the-art Bayesian filters in settings where the underlying models are known.
Findings
Deep learning outperforms Bayesian filters in complex scenarios.
Deep learning matches Bayesian filters in simple scenarios.
The proposed method benefits from unlimited training data.
Abstract
Multitarget Tracking (MTT) is the problem of tracking the states of an unknown number of objects using noisy measurements, with important applications to autonomous driving, surveillance, robotics, and others. In the model-based Bayesian setting, there are conjugate priors that enable us to express the multi-object posterior in closed form, which could theoretically provide Bayes-optimal estimates. However, the number of hypotheses in the posterior grows super-exponentially over time, forcing state-of-the-art methods to resort to approximations to remain tractable, which can impact their performance in complex scenarios. Model-free methods based on deep learning provide an attractive alternative, as they can, in principle, learn the optimal filter from data, but to the best of our knowledge they have never been compared to current state-of-the-art Bayesian filters, especially not in…
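To give a feel for the hypothesis growth the abstract mentions, here is a minimal sketch that counts data-association hypotheses under simplifying assumptions that are not taken from the paper: a fixed number of targets, each detected by at most one measurement per scan, and each measurement originating from at most one target (unassigned measurements count as clutter). The function names are hypothetical and exist only for this illustration.

```python
from math import comb, factorial


def single_scan_associations(n_targets: int, n_measurements: int) -> int:
    """Count the one-to-one partial assignments between targets and measurements.

    For each k, choose k detected targets, choose k measurements for them,
    and match them in k! ways; the remaining targets are missed and the
    remaining measurements are treated as clutter.
    """
    return sum(
        comb(n_targets, k) * comb(n_measurements, k) * factorial(k)
        for k in range(min(n_targets, n_measurements) + 1)
    )


def hypotheses_over_scans(n_targets: int, n_measurements: int, n_scans: int) -> int:
    """Upper bound on association sequences when every scan looks the same
    (illustrative assumption: fixed target and measurement counts)."""
    return single_scan_associations(n_targets, n_measurements) ** n_scans


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, single_scan_associations(n, n), hypotheses_over_scans(n, n, 5))
```

Even in this simplified setting the count multiplies with every scan. In the full problem the number of potential targets is itself unknown and can grow with each new measurement, so the per-scan factor also increases over time, which is why practical Bayesian filters prune or merge hypotheses.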
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gaussian Processes and Bayesian Inference · Target Tracking and Data Fusion in Sensor Networks
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Attention Is All You Need · Dropout · Byte Pair Encoding · Residual Connection · Layer Normalization · Label Smoothing · Adam
