End-to-end Recurrent Multi-Object Tracking and Trajectory Prediction with Relational Reasoning

Fabian B. Fuchs; Adam R. Kosiorek; Li Sun; Oiwi Parker Jones; Ingmar Posner

arXiv:1907.12887 · cs.CV · September 29, 2020 · 5 cites

Open Access

TL;DR

This paper introduces MOHART, an end-to-end, class-agnostic multi-object tracking system with relational reasoning that models object interactions, improving tracking and trajectory prediction in complex scenes.

Contribution

It presents a multi-object tracking method that adds relational reasoning through permutation-invariant architectures, with a multi-headed attention variant outperforming DeepSets.
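To make "permutation-invariant" concrete, here is a minimal sketch of DeepSets-style pooling in plain Python. It is an illustration only, not the paper's implementation: the functions `phi`, `rho`, and `deep_sets` and the toy feature vectors are hypothetical stand-ins, and the real model would use learned networks in place of the fixed functions.

```python
import math
from itertools import permutations


def phi(x):
    # Hypothetical per-object encoder (a learned network in practice).
    return [math.tanh(v) for v in x]


def rho(pooled):
    # Hypothetical readout applied to the pooled set representation.
    return [2.0 * v + 1.0 for v in pooled]


def deep_sets(objects):
    """Compute rho(sum_i phi(x_i)); the sum discards object order."""
    encoded = [phi(x) for x in objects]
    pooled = [sum(column) for column in zip(*encoded)]
    return rho(pooled)


# Toy per-object feature vectors (assumed values, for illustration).
objects = [[0.1, 0.5], [1.0, -0.3], [-0.7, 0.2]]
reference = deep_sets(objects)

# Every ordering of the same objects yields the same output
# (up to floating-point rounding).
for perm in permutations(objects):
    out = deep_sets(list(perm))
    assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(out, reference))
```

Because the only interaction between objects is a single sum, the result cannot depend on the order in which the tracked objects are listed.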

Findings

01 Relational reasoning improves tracking accuracy.

02 A multi-headed attention architecture outperforms DeepSets.

03 Modeling interactions improves trajectory prediction on real-world datasets.

Abstract

The majority of contemporary object-tracking approaches do not model interactions between objects. This contrasts with the fact that objects' paths are not independent: a cyclist might abruptly deviate from a previously planned trajectory in order to avoid colliding with a car. Building upon HART, a neural class-agnostic single-object tracker, we introduce a multi-object tracking method, MOHART, capable of relational reasoning. Importantly, the entire system, including the understanding of interactions and relations between objects, is class-agnostic and learned simultaneously in an end-to-end fashion. We explore a number of relational reasoning architectures and show that permutation-invariant models outperform non-permutation-invariant alternatives. We also find that architectures using a single permutation-invariant operation like DeepSets, despite, in theory, being universal function…
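As a rough illustration of how attention lets each tracked object condition on the others, the sketch below implements scaled dot-product self-attention over a set of object feature vectors. It makes several simplifying assumptions that are not taken from the paper: a single head rather than multiple heads, identity query/key/value projections instead of learned ones, and hypothetical names (`self_attention`, `softmax`, `dot`) and toy values throughout.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(objects):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the raw object features
    (identity projections); a real model would learn these projections.
    """
    scale = math.sqrt(len(objects[0]))
    outputs = []
    for query in objects:
        weights = softmax([dot(query, key) / scale for key in objects])
        # Context vector: attention-weighted average of all object features.
        context = [
            sum(w * value[d] for w, value in zip(weights, objects))
            for d in range(len(query))
        ]
        outputs.append(context)
    return outputs


# Toy per-object feature vectors (assumed values, for illustration).
objects = [[0.1, 0.5], [1.0, -0.3], [-0.7, 0.2]]
out = self_attention(objects)

# Swapping two input objects swaps the corresponding outputs
# (permutation equivariance), so no object ordering is privileged.
swapped = self_attention([objects[1], objects[0], objects[2]])
assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(out[0], swapped[1]))
```

Unlike a single pooled sum, each object here receives its own context vector, weighted by how strongly it attends to every other object.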

Taxonomy

Topics: Video Surveillance and Tracking Methods · Autonomous Vehicle Technology and Safety · Human Pose and Action Recognition