# Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles

**Authors:** Siddharth Srivastava, Frederic Jurie, Gaurav Sharma

arXiv: 1904.08494 · 2019-10-14

## TL;DR

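This paper addresses 3D object detection from monocular 2D images in autonomous driving. It lifts the images to 3D representations with learned neural networks, then runs existing 3D detection networks on the result. On the KITTI benchmark, the approach significantly outperforms the previous state of the art for monocular 3D detection and beats many recent networks that work on real 3D sensor data.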

## Contribution

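The authors propose a learned 2D to 3D lifting approach that, with a carefully designed training mechanism and automatically selected minimally noisy data, outperforms many methods working on real 3D inputs from physical sensors. They argue it could serve as a backup when physical 3D sensors malfunction.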

## Key findings

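- On the KITTI benchmark, outperforms many recent competitive 3D networks and significantly outperforms the previous state of the art for 3D detection from monocular images.
- Late fusion of the outputs of a network trained on generated 3D data and one trained on real 3D data improves performance.
- The authors argue the method could back up expensive 3D sensors, and potentially make them redundant in low-injury-risk settings such as warehouse automation.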
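
The late-fusion finding above can be made concrete with a small sketch. This summary does not say how the paper combines the two networks' outputs, so the scheme below is an assumption chosen for illustration: pool the detections from both networks and apply greedy non-maximum suppression. The names `Detection`, `iou`, and `late_fuse`, the axis-aligned bird's-eye-view box format, and the 0.5 overlap threshold are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical detection format (not from the paper):
# (x_min, y_min, x_max, y_max, score) for an axis-aligned bird's-eye-view box.
Detection = Tuple[float, float, float, float, float]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def late_fuse(
    generated: List[Detection],
    real: List[Detection],
    iou_thresh: float = 0.5,  # assumed threshold, not a value from the paper
) -> List[Detection]:
    """Pool both detectors' outputs, then keep the highest-scoring box
    among any group of heavily overlapping boxes (greedy NMS)."""
    pooled = sorted(generated + real, key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in pooled:
        if all(iou(det, k) < iou_thresh for k in kept):
            kept.append(det)
    return kept


# Toy inputs: one object seen by both networks, one seen only by the
# network trained on generated 3D data.
from_generated = [(0.0, 0.0, 2.0, 2.0, 0.6), (10.0, 10.0, 12.0, 12.0, 0.7)]
from_real = [(0.1, 0.0, 2.1, 2.0, 0.9)]
fused = late_fuse(from_generated, from_real)
```

In this toy case the two overlapping boxes collapse to the higher-scoring one, while the box found by only one network survives. That illustrates one way fusion could recover detections a single network misses, under the stated assumptions.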

## Abstract

We address the problem of 3D object detection from 2D monocular images in autonomous driving scenarios. We propose to lift the 2D images to 3D representations using learned neural networks and leverage existing networks working directly on 3D data to perform 3D object detection and localization. We show that, with carefully designed training mechanism and automatically selected minimally noisy data, such a method is not only feasible, but gives higher results than many methods working on actual 3D inputs acquired from physical sensors. On the challenging KITTI benchmark, we show that our 2D to 3D lifted method outperforms many recent competitive 3D networks while significantly outperforming previous state-of-the-art for 3D detection from monocular images. We also show that a late fusion of the output of the network trained on generated 3D images, with that trained on real 3D images, improves performance. We find the results very interesting and argue that such a method could serve as a highly reliable backup in case of malfunction of expensive 3D sensors, if not potentially making them redundant, at least in the case of low human injury risk autonomous navigation scenarios like warehouse automation.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.08494/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1904.08494/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1904.08494/full.md

---
Source: https://tomesphere.com/paper/1904.08494