End-to-end Recurrent Multi-Object Tracking and Trajectory Prediction with Relational Reasoning
Fabian B. Fuchs, Adam R. Kosiorek, Li Sun, Oiwi Parker Jones, Ingmar Posner

TL;DR
This paper introduces MOHART, an end-to-end, class-agnostic multi-object tracking system with relational reasoning that models object interactions, improving tracking and trajectory prediction in complex scenes.
Contribution
It presents a multi-object tracking method that performs relational reasoning with permutation-invariant architectures. The multi-headed attention variant outperforms simpler models built on a single permutation-invariant operation, such as DeepSets.
Findings
Relational reasoning improves tracking accuracy.
Multi-headed attention architecture outperforms DeepSets.
Modeling interactions enhances trajectory prediction in real-world datasets.
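To make the distinction between the two relational modules concrete, here is a minimal sketch in plain Python. It is not MOHART's implementation: the function names are hypothetical, the attention uses a single head with identity projections instead of learned multi-headed ones, and the feature vectors are made up. It only illustrates that sum pooling (the DeepSets-style operation) is unchanged when objects are reordered, while self-attention gives each object its own relational summary that follows the object through any reordering.

```python
import math


def deepsets_pool(features):
    """Sum-pool per-object features: one permutation-invariant operation."""
    dim = len(features[0])
    return [sum(f[d] for f in features) for d in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys and values are the raw
    features (identity projections); a trained model would learn them.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        outputs.append([sum(w * value[d] for w, value in zip(weights, features))
                        for d in range(dim)])
    return outputs


# Three tracked objects with made-up 2-D feature vectors.
objects = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
order = [2, 0, 1]
shuffled = [objects[i] for i in order]

pooled = deepsets_pool(objects)
pooled_shuffled = deepsets_pool(shuffled)

attended = self_attention(objects)
attended_shuffled = self_attention(shuffled)
```

Here `pooled` and `pooled_shuffled` agree, and `attended_shuffled[i]` matches `attended[order[i]]` up to floating-point rounding. The pooled vector is a single summary shared by all objects, whereas attention weights the other objects differently for each query object.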
Abstract
The majority of contemporary object-tracking approaches do not model interactions between objects. This contrasts with the fact that objects' paths are not independent: a cyclist might abruptly deviate from a previously planned trajectory in order to avoid colliding with a car. Building upon HART, a neural class-agnostic single-object tracker, we introduce a multi-object tracking method MOHART capable of relational reasoning. Importantly, the entire system, including the understanding of interactions and relations between objects, is class-agnostic and learned simultaneously in an end-to-end fashion. We explore a number of relational reasoning architectures and show that permutation-invariant models outperform non-permutation-invariant alternatives. We also find that architectures using a single permutation invariant operation like DeepSets, despite, in theory, being universal function…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Autonomous Vehicle Technology and Safety · Human Pose and Action Recognition
