TL;DR
This paper introduces a real-time multi-person tracking method that combines detection and tracking outputs, using deep learning for candidate scoring and person re-identification to improve accuracy and robustness.
Contribution
It proposes a novel candidate selection framework with a CNN-based scoring function and a deeply learned appearance model for enhanced multi-person tracking.
Findings
Achieves real-time performance on benchmark datasets.
Outperforms existing methods in accuracy and robustness.
Effectively handles occlusion and noisy detections.
Abstract
Online multi-object tracking is a fundamental problem in time-critical video analysis applications. A major challenge in the popular tracking-by-detection framework is how to associate unreliable detection results with existing tracks. In this paper, we propose to handle unreliable detection by collecting candidates from outputs of both detection and tracking. The intuition behind generating redundant candidates is that detection and tracks can complement each other in different scenarios. Detection results of high confidence prevent tracking drifts in the long term, and predictions of tracks can handle noisy detection caused by occlusion. In order to apply optimal selection from a considerable number of candidates in real time, we present a novel scoring function based on a fully convolutional neural network that shares most computations on the entire image. Moreover, we adopt a…
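The candidate-collection idea in the abstract can be illustrated with a minimal sketch. It assumes each candidate already carries a confidence score (a stand-in for the paper's CNN-based scoring function, which is not reproduced here) and uses greedy non-maximum suppression as the selection step. The names `Candidate` and `select_candidates`, the `source` labels, and the thresholds `min_score` and `iou_threshold` are hypothetical choices for illustration, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Candidate:
    box: Box
    score: float  # assumed to come from some scoring function
    source: str   # "detection" or "track" (illustrative labels)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_candidates(
    detections: List[Candidate],
    track_predictions: List[Candidate],
    min_score: float = 0.3,      # assumed threshold
    iou_threshold: float = 0.5,  # assumed threshold
) -> List[Candidate]:
    """Pool candidates from both sources, then keep the best non-overlapping ones."""
    pool = [c for c in detections + track_predictions if c.score >= min_score]
    pool.sort(key=lambda c: c.score, reverse=True)
    kept: List[Candidate] = []
    for cand in pool:
        # Greedy NMS: drop a candidate that overlaps a higher-scored one.
        if all(iou(cand.box, k.box) <= iou_threshold for k in kept):
            kept.append(cand)
    return kept


# One confident detection, and one weak detection (e.g. under occlusion).
detections = [
    Candidate((10, 10, 50, 100), 0.9, "detection"),
    Candidate((200, 20, 240, 110), 0.2, "detection"),
]
# Track predictions for the same two people.
track_predictions = [
    Candidate((12, 11, 52, 102), 0.6, "track"),
    Candidate((198, 22, 238, 112), 0.7, "track"),
]
selected = select_candidates(detections, track_predictions)
print([c.source for c in selected])
```

In this toy input, the confident detection suppresses the overlapping track prediction for the first person, while the track prediction stands in for the second person whose detection falls below the score threshold. This mirrors the complementary roles of the two sources described in the abstract.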
