End-to-end Learning of Multi-sensor 3D Tracking by Detection
Davi Frossard, Raquel Urtasun

TL;DR
This paper proposes a tracking-by-detection framework that combines camera and LIDAR data to produce accurate 3D trajectories. Tracking is formulated as a linear program that can be solved exactly, and the convolutional networks used for detection and matching are learned end to end. The model achieves competitive results on KITTI.
Contribution
It presents a novel end-to-end approach for multi-sensor 3D tracking that jointly learns detection and matching, integrating multiple sensor modalities.
Findings
Produces accurate 3D trajectories on the KITTI dataset.
Formulates tracking as a linear program that can be solved exactly.
Performs competitively with state-of-the-art methods.
Abstract
In this paper we propose a novel approach to tracking by detection that can exploit both cameras as well as LIDAR data to produce very accurate 3D trajectories. Towards this goal, we formulate the problem as a linear program that can be solved exactly, and learn convolutional networks for detection as well as matching in an end-to-end manner. We evaluate our model in the challenging KITTI dataset and show very competitive results.
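To make the idea of tracking as an exactly solvable program more concrete, here is a minimal sketch of a generic tracking-by-detection objective with flow-conservation constraints. It is not the authors' formulation: the detection names, the scores, and the helper functions `feasible`, `score`, and `solve` are all hypothetical, and exhaustive enumeration of the binary variables stands in for a real linear-program solver. In the paper the scores would come from learned convolutional networks; here they are hand-picked constants.

```python
from itertools import product

# Hypothetical toy scores (assumptions, not values from the paper).
# Frame 0 has detections a0 and b0; frame 1 has detection a1.
det_score = {"a0": 2.0, "b0": -1.0, "a1": 1.5}         # confidence that a detection is real
link_score = {("a0", "a1"): 1.0, ("b0", "a1"): -0.5}   # affinity between detections in consecutive frames
NEW_SCORE = -0.5   # penalty for starting a trajectory
END_SCORE = -0.5   # penalty for ending a trajectory

dets = list(det_score)
links = list(link_score)


def feasible(det, link, new, end):
    """Flow conservation: an active detection has exactly one way in and one way out."""
    for j in dets:
        inflow = new[j] + sum(link[e] for e in links if e[1] == j)
        outflow = end[j] + sum(link[e] for e in links if e[0] == j)
        if inflow != det[j] or outflow != det[j]:
            return False
    return True


def score(det, link, new, end):
    """Linear objective: a weighted sum of the binary decision variables."""
    return (
        sum(det_score[j] * det[j] for j in dets)
        + sum(link_score[e] * link[e] for e in links)
        + NEW_SCORE * sum(new.values())
        + END_SCORE * sum(end.values())
    )


def solve():
    """Enumerate every binary assignment and keep the best feasible one."""
    n, m = len(dets), len(links)
    best = None
    for bits in product((0, 1), repeat=3 * n + m):
        det = dict(zip(dets, bits[:n]))
        link = dict(zip(links, bits[n:n + m]))
        new = dict(zip(dets, bits[n + m:2 * n + m]))
        end = dict(zip(dets, bits[2 * n + m:]))
        if not feasible(det, link, new, end):
            continue
        s = score(det, link, new, end)
        if best is None or s > best[0]:
            best = (s, det, link)
    return best


best_score, best_det, best_link = solve()
print(best_score, best_det, best_link)
```

With these toy scores the best solution keeps a0 and a1, links them into one trajectory, and rejects b0 as a false positive. Enumeration is only practical for a handful of variables; the point of a linear-program formulation is that a solver can find the same optimum efficiently at realistic scale.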
