Binary Quadratic Programing for Online Tracking of Hundreds of People in Extremely Crowded Scenes
Afshin Dehghan, Mubarak Shah

TL;DR
This paper introduces an online multi-object tracking method for extremely crowded scenes. It formulates tracking as a Binary Quadratic Program and solves that program with an efficient optimization algorithm, enabling real-time tracking of hundreds of targets.
Contribution
It presents a new quadratic programming formulation for crowd tracking that integrates appearance, motion, and contextual cues, along with an efficient solver for high-density scenarios.
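To make the general shape of a binary quadratic program concrete, here is a minimal sketch in Python. It is not the paper's formulation or solver: the cost values, the one-candidate-per-target constraint, and the function name `solve_toy_bqp` are all assumptions chosen for illustration. Unary costs stand in for per-target cues such as appearance and motion, and pairwise costs stand in for contextual cues between targets. The toy instance is solved by brute-force enumeration, which is exponential in the number of targets and is only workable at this scale.

```python
import itertools


def solve_toy_bqp(unary, pairwise):
    """Brute-force a tiny binary quadratic program (illustrative only).

    unary[t][k]    : hypothetical cost of giving target t candidate location k
    pairwise[i][j] : hypothetical cost of selecting binary variables i and j
                     together, where variable index = t * n_cands + k
    Assumed constraint: each target selects exactly one candidate.
    """
    n_targets = len(unary)
    n_cands = len(unary[0])
    best_cost, best_choice = float("inf"), None
    for choice in itertools.product(range(n_cands), repeat=n_targets):
        # Indices of the binary variables that are set to 1 for this choice.
        active = [t * n_cands + k for t, k in enumerate(choice)]
        # Linear term: sum of unary costs of the selected candidates.
        cost = sum(unary[t][k] for t, k in enumerate(choice))
        # Quadratic term x^T Q x restricted to the active variables.
        cost += sum(pairwise[i][j] for i in active for j in active)
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_choice, best_cost


# Hypothetical costs: 2 targets, 3 candidate locations each.
unary = [
    [0.2, 0.9, 0.8],
    [0.7, 0.1, 0.3],
]
pairwise = [[0.0] * 6 for _ in range(6)]
# Penalize pairing target 0 / candidate 0 with target 1 / candidate 1,
# e.g. because the two moves would contradict each other.
pairwise[0][4] = pairwise[4][0] = 0.5

choice, cost = solve_toy_bqp(unary, pairwise)
print(choice, cost)
```

In this toy instance the pairwise penalty changes the answer: picking the cheapest candidate for each target independently would select the penalized pair, so the joint optimum assigns target 1 a different candidate. That coupling between targets is what the quadratic term adds over a purely linear assignment cost.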
Findings
Successfully tracks hundreds of targets in crowded scenes.
Achieves significant improvements over state-of-the-art methods.
Operates efficiently in real-time conditions.
Abstract
Multi-object tracking has been studied for decades. However, when it comes to tracking pedestrians in extremely crowded scenes, we are limited to only a few works. This is an important problem which gives rise to several challenges. Pre-trained object detectors fail to localize targets in crowded sequences. This consequently limits the use of data-association based multi-target tracking methods, which rely on the outcome of an object detector. Additionally, the small apparent target size makes it challenging to extract features to discriminate targets from their surroundings. Finally, the large number of targets greatly increases computational complexity, which in turn makes it hard to extend existing multi-target tracking approaches to high-density crowd scenarios. In this paper, we propose a tracker that addresses the aforementioned problems and is capable of tracking hundreds of people…
