Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss
Eskil Jörgensen, Christopher Zach, Fredrik Kahl

TL;DR
This paper introduces SS3D, an end-to-end monocular 3D object detection framework that combines a CNN with a 3D bounding box optimizer, achieving state-of-the-art accuracy at real-time speeds for autonomous driving.
Contribution
The paper presents a novel single-stage monocular 3D detection method that models heteroscedastic uncertainty and enables end-to-end training via back-propagation through the optimizer.
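To make "modeling heteroscedastic uncertainty" concrete, here is a minimal sketch of one common formulation: a Gaussian negative log-likelihood in which the network predicts a log-variance alongside each regression target. This is an illustration under assumptions, not necessarily the exact loss used in SS3D; the function name `heteroscedastic_l2_loss` and its arguments are hypothetical.

```python
import math

def heteroscedastic_l2_loss(pred, target, log_var):
    """Mean Gaussian negative log-likelihood (constant term dropped).

    pred, target, log_var are equal-length sequences of floats.
    log_var[i] is the predicted log-variance for output i (an assumed
    parameterization, chosen here for numerical convenience).
    """
    total = 0.0
    for p, t, s in zip(pred, target, log_var):
        # The squared residual is down-weighted by the predicted variance,
        # while the + 0.5 * s term penalizes claiming high uncertainty everywhere.
        total += 0.5 * math.exp(-s) * (p - t) ** 2 + 0.5 * s
    return total / len(pred)

# With log-variance 0 the loss reduces to half the mean squared error.
baseline = heteroscedastic_l2_loss([1.0, 2.0], [0.0, 2.0], [0.0, 0.0])

# For a large residual, predicting a higher variance lowers the loss.
confident = heteroscedastic_l2_loss([3.0], [0.0], [0.0])
uncertain = heteroscedastic_l2_loss([3.0], [0.0], [2.0])
```

In a formulation like this, the per-output variance lets the model discount targets it cannot predict reliably, and the same variances could plausibly serve as weights in a downstream box-fitting step.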
Findings
Achieves state-of-the-art accuracy in monocular 3D detection
Runs at 20 fps in a straightforward implementation
End-to-end training improves detection performance
Abstract
Three-dimensional object detection from a single view is a challenging task which, if performed with good accuracy, is an important enabler of low-cost mobile robot perception. Previous approaches to this problem suffer either from an overly complex inference engine or from an insufficient detection accuracy. To deal with these issues, we present SS3D, a single-stage monocular 3D object detector. The framework consists of (i) a CNN, which outputs a redundant representation of each relevant object in the image with corresponding uncertainty estimates, and (ii) a 3D bounding box optimizer. We show how modeling heteroscedastic uncertainty improves performance upon our baseline, and furthermore, how back-propagation can be done through the optimizer in order to train the pipeline end-to-end for additional accuracy. Our method achieves SOTA accuracy on monocular 3D object detection, while…
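The title refers to an intersection-over-union (IoU) loss on 3D boxes. As a rough illustration of the quantity involved, the sketch below computes IoU for axis-aligned 3D boxes and turns it into a loss as 1 - IoU. This is a simplification made for this example: boxes in driving scenes are generally rotated, and the paper's actual loss and its differentiable implementation may differ. The box layout `(xmin, ymin, zmin, xmax, ymax, zmax)` and both function names are assumptions.

```python
def iou_3d_axis_aligned(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])          # start of the overlap along axis i
        hi = min(a[i + 3], b[i + 3])  # end of the overlap along axis i
        inter *= max(0.0, hi - lo)    # zero if the boxes do not overlap on this axis
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0

def iou_loss(a, b):
    """Loss that is 0 for identical boxes and 1 for disjoint boxes."""
    return 1.0 - iou_3d_axis_aligned(a, b)

box_a = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
box_b = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
overlap = iou_3d_axis_aligned(box_a, box_b)  # intersection 4, union 12
```

One motivation for an overlap-based loss of this kind is that it scores the fitted box as a whole instead of penalizing each box parameter independently.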
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Robotics and Sensor-Based Localization
