Combined Image- and World-Space Tracking in Traffic Scenes
Aljosa Osep, Wolfgang Mehner, Markus Mathias, Bastian Leibe

TL;DR
This paper introduces a novel joint image- and world-space tracking method for urban scenes, enhancing 3D localization accuracy in autonomous driving by integrating 2D and 3D data throughout the tracking pipeline.
Contribution
It presents a new coupled 2D-3D Kalman filter and an extendable hypothesize-and-select framework for continuous tracking in both image and world space.
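To make the idea of fusing image-space and world-space measurements in one filter concrete, here is a minimal sketch in Python. It is not the paper's coupled 2D-3D Kalman filter, whose exact state and measurement models are not given in this summary. It assumes a single target with a one-dimensional lateral position and velocity, a known fixed depth, and a pinhole projection; the constants `F_PX`, `C_PX`, `DEPTH` and the functions `predict` and `update` are hypothetical names chosen for illustration.

```python
# Illustrative only: a tiny Kalman filter that fuses an image-space and a
# world-space measurement of one target. Not the paper's formulation.

F_PX = 700.0   # assumed focal length in pixels (hypothetical value)
C_PX = 600.0   # assumed principal-point column in pixels (hypothetical value)
DEPTH = 20.0   # assumed known, fixed depth in metres (a simplification)


def predict(x, v, P, dt, q):
    """Constant-velocity prediction for state (x, v).

    P is the 2x2 covariance stored as (p00, p01, p10, p11);
    q is process noise added to both diagonal entries.
    """
    p00, p01, p10, p11 = P
    x_new = x + dt * v
    # P' = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = diag(q, q)
    n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
    n01 = p01 + dt * p11
    n10 = p10 + dt * p11
    n11 = p11 + q
    return x_new, v, (n00, n01, n10, n11)


def update(x, v, P, z, h, offset, r):
    """Scalar measurement update for z = h * x + offset + noise (variance r)."""
    p00, p01, p10, p11 = P
    innovation = z - (h * x + offset)
    s = h * h * p00 + r            # innovation variance
    k0 = p00 * h / s               # Kalman gain for position
    k1 = p10 * h / s               # Kalman gain for velocity
    x_new = x + k0 * innovation
    v_new = v + k1 * innovation
    # P' = (I - K H) P with H = [h, 0]
    n00 = (1.0 - k0 * h) * p00
    n01 = (1.0 - k0 * h) * p01
    n10 = p10 - k1 * h * p00
    n11 = p11 - k1 * h * p01
    return x_new, v_new, (n00, n01, n10, n11)


# Usage: one prediction step, then a noisy world-space measurement (metres)
# followed by a more precise image-space measurement (pixels).
x, v, P = 0.0, 0.0, (10.0, 0.0, 0.0, 10.0)
x, v, P = predict(x, v, P, dt=0.1, q=0.01)
x, v, P = update(x, v, P, z=2.0, h=1.0, offset=0.0, r=1.0)
x, v, P = update(x, v, P, z=C_PX + F_PX / DEPTH * 2.0,
                 h=F_PX / DEPTH, offset=C_PX, r=4.0)
```

In this toy setup both measurement types correct the same world-space state, so a precise image measurement tightens the world-space position estimate even when the 3D measurement is noisy. The noise values are made up; a real system would calibrate them per sensor.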
Findings
Achieves state-of-the-art performance on the KITTI benchmark.
Significantly improves 3D localization precision.
Effectively combines 2D and 3D information throughout the tracking process.
Abstract
Tracking in urban street scenes plays a central role in autonomous systems such as self-driving cars. Most of the current vision-based tracking methods perform tracking in the image domain. Other approaches, e.g., based on LIDAR and radar, track purely in 3D. While some vision-based tracking methods invoke 3D information in parts of their pipeline, and some 3D-based methods utilize image-based information in components of their approach, we propose to use image- and world-space information jointly throughout our method. We present our tracking pipeline as a 3D extension of image-based tracking. From enhancing the detections with 3D measurements to the reported positions of every tracked object, we use world-space 3D information at every stage of processing. We accomplish this by our novel coupled 2D-3D Kalman filter, combined with a conceptually clean and extendable hypothesize-and-select…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods · Advanced Optical Sensing Technologies
