Simple Cues Lead to a Strong Multi-Object Tracker

Jenny Seidenschwarz, Guillem Brasó, Victor Castro Serrano, Ismail Elezi, and Laura Leal-Taixé

arXiv:2206.04656 · cs.CV · April 27, 2023 · 6 cites

TL;DR

This paper demonstrates that simple tracking-by-detection methods, when combined with basic appearance features and motion models, can achieve state-of-the-art multi-object tracking performance across multiple datasets.

Contribution

The authors show that a standard re-identification network combined with simple motion cues can match complex end-to-end models in multi-object tracking.

Findings

01. Achieves state-of-the-art results on the MOT17, MOT20, BDD100k, and DanceTrack datasets.

02. Simple motion cues combined with a standard re-identification network are enough for strong tracking performance.

03. An analysis of the appearance model's failure cases provides insights for further improvements.

Abstract

For a long time, the most common paradigm in Multi-Object Tracking was tracking-by-detection (TbD), where objects are first detected and then associated over video frames. For association, most models resorted to motion and appearance cues, e.g., re-identification networks. Recent approaches based on attention propose to learn the cues in a data-driven manner, showing impressive results. In this paper, we ask ourselves whether simple good old TbD methods are also capable of achieving the performance of end-to-end models. To this end, we propose two key ingredients that allow a standard re-identification network to excel at appearance-based tracking. We extensively analyse its failure cases, and show that a combination of our appearance features with a simple motion model leads to strong tracking results. Our tracker generalizes to four public datasets, namely MOT17, MOT20, BDD100k, and…
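
The association step of tracking-by-detection described in the abstract (match existing tracks to new detections using appearance and motion cues) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the dictionary layout with `box` and `feat` keys, the equal cue weighting, the cost threshold, the use of box overlap as the motion cue, and the greedy matching (many trackers use Hungarian matching instead) are all assumptions chosen for illustration.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cosine_distance(u, v):
    """1 - cosine similarity between two appearance embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm if norm > 0 else 1.0

def associate(tracks, detections, weight=0.5, max_cost=0.7):
    """Greedily match tracks to detections by a weighted appearance + motion cost.

    Hypothetical helper: `weight` and `max_cost` are illustrative values,
    not parameters taken from the paper.
    """
    pairs = []
    for ti, t in enumerate(tracks):
        for di, d in enumerate(detections):
            appearance = cosine_distance(t["feat"], d["feat"])
            motion = 1.0 - iou(t["box"], d["box"])  # low when boxes overlap
            pairs.append((weight * appearance + (1 - weight) * motion, ti, di))
    pairs.sort()  # cheapest candidate pairs first

    matches, used_tracks, used_dets = [], set(), set()
    for cost, ti, di in pairs:
        if cost > max_cost:
            break  # every remaining pair is too dissimilar to match
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches

# Two tracks from the previous frame and two detections in the current one.
tracks = [
    {"box": (0, 0, 10, 10), "feat": (1.0, 0.0)},
    {"box": (50, 50, 60, 60), "feat": (0.0, 1.0)},
]
detections = [
    {"box": (51, 50, 61, 60), "feat": (0.1, 0.9)},
    {"box": (1, 0, 11, 10), "feat": (0.9, 0.1)},
]
matches = associate(tracks, detections)  # list of (track_index, detection_index)
```

In a full tracker, unmatched detections would typically start new tracks and unmatched tracks would be kept inactive for some frames before being dropped. Those policies are left out of this sketch.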

Code & Models

Repositories

dvl-tum/ghost
PyTorch · Official

Taxonomy

Topics: Video Surveillance and Tracking Methods · Face recognition and analysis · Human Pose and Action Recognition