TL;DR
This paper introduces multiple object forecasting (MOF), a new task of predicting the future bounding boxes of tracked objects rather than their trajectories alone. It is supported by the large Citywalks dataset and a novel model, STED, that outperforms existing methods.
Contribution
It formulates MOF as a new problem, introduces the Citywalks dataset, and proposes STED, a novel architecture that effectively models object and ego-motion.
Findings
STED outperforms existing approaches on MOF tasks.
The Citywalks dataset covers 21 cities in 10 European countries under a variety of weather conditions.
Cross-dataset generalization is confirmed on MOT-17 without fine-tuning.
Abstract
This paper introduces the problem of multiple object forecasting (MOF), in which the goal is to predict future bounding boxes of tracked objects. In contrast to existing works on object trajectory forecasting, which primarily consider the problem from a bird's-eye perspective, we formulate the problem from an object-level perspective and call for the prediction of full object bounding boxes, rather than trajectories alone. Towards solving this task, we introduce the Citywalks dataset, which consists of over 200k high-resolution video frames. Citywalks comprises footage recorded in 21 cities from 10 European countries in a variety of weather conditions and over 3.5k unique pedestrian trajectories. For evaluation, we adapt existing trajectory forecasting methods for MOF and confirm cross-dataset generalizability on the MOT-17 dataset without fine-tuning. Finally, we present STED, a novel…
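To make the task formulation concrete, the sketch below shows the input and output of an MOF-style predictor: a short history of boxes for one tracked object goes in, and a sequence of future boxes comes out. It is not STED and not one of the adapted baselines from the paper. It is a hypothetical constant-velocity extrapolation, and the (cx, cy, w, h) box layout, the function names, and the use of IoU as a score are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (centre x, centre y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def constant_velocity_forecast(past: List[Box], horizon: int) -> List[Box]:
    """Extrapolate future boxes using the mean per-frame change seen in `past`."""
    if len(past) < 2:
        raise ValueError("need at least two past boxes to estimate velocity")
    steps = len(past) - 1
    # Mean per-frame change of each component (position and size).
    velocity = tuple((past[-1][i] - past[0][i]) / steps for i in range(4))
    last = past[-1]
    future = []
    for t in range(1, horizon + 1):
        cx, cy, w, h = (last[i] + t * velocity[i] for i in range(4))
        # Clamp width and height so a shrinking box never becomes degenerate.
        future.append((cx, cy, max(w, 1.0), max(h, 1.0)))
    return future


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two (cx, cy, w, h) boxes."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


# Three observed frames of one pedestrian moving right and growing in size.
past_boxes = [(100.0, 200.0, 40.0, 80.0),
              (102.0, 200.0, 41.0, 82.0),
              (104.0, 200.0, 42.0, 84.0)]
predicted = constant_velocity_forecast(past_boxes, horizon=2)
```

Because the prediction is a full box rather than a point, it can be compared with a ground-truth box by overlap (for example with `iou`) as well as by centre distance, which is the practical difference from trajectory-only forecasting.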
