# Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image

**Authors:** Florian Chabot, Mohamed Chaouch, Jaonary Rabarisoa, Céline Teulière, Thierry Chateau

arXiv: 1703.07570 · 2017-03-24

## TL;DR

Deep MANTA is a many-task convolutional network that performs vehicle detection, part localization, visibility characterization, and 3D dimension estimation from a monocular image. At inference, its outputs feed a robust pose estimation algorithm that recovers fine orientation and 3D vehicle location.

## Contribution

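The paper introduces a coarse-to-fine object proposal architecture that boosts vehicle detection, and a many-task network that localizes vehicle parts, including parts that are not visible. The network's outputs are then used to estimate orientation and 3D location from a monocular image.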

## Key findings

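- Outperforms monocular state-of-the-art approaches on the KITTI benchmark for vehicle detection, orientation, and 3D location
- Localizes vehicle parts even when they are not visible
- Uses a real-time robust pose estimation algorithm at inference for fine orientation estimation and 3D vehicle localization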

## Abstract

In this paper, we present a novel approach, called Deep MANTA (Deep Many-Tasks), for many-task vehicle analysis from a given image. A robust convolutional network is introduced for simultaneous vehicle detection, part localization, visibility characterization and 3D dimension estimation. Its architecture is based on a new coarse-to-fine object proposal that boosts the vehicle detection. Moreover, the Deep MANTA network is able to localize vehicle parts even if these parts are not visible. In the inference, the network's outputs are used by a real time robust pose estimation algorithm for fine orientation estimation and 3D vehicle localization. We show in experiments that our method outperforms monocular state-of-the-art approaches on vehicle detection, orientation and 3D location tasks on the very challenging KITTI benchmark.
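
The abstract does not spell out how the pose estimation step works. One common way to recover pose from 2D part locations and a 3D shape is to look for the pose that minimizes reprojection error, and the sketch below illustrates only that quantity. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `project_parts` and `reprojection_error`, the pinhole camera model, the axis conventions, the yaw-only rotation, and the use of a visibility mask to select parts.

```python
import math


def project_parts(parts_3d, yaw, translation, focal, principal_point):
    """Project 3D part points (vehicle frame) into the image with a pinhole camera.

    Assumed conventions (illustrative, not from the paper): camera x points
    right, y down, z forward; the vehicle rotates by `yaw` about the y axis.
    """
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    cx, cy = principal_point
    pixels = []
    for x, y, z in parts_3d:
        # Rotate about the y axis, then translate into the camera frame.
        xc = cos_y * x + sin_y * z + tx
        yc = y + ty
        zc = -sin_y * x + cos_y * z + tz
        # Perspective division onto the image plane.
        pixels.append((focal * xc / zc + cx, focal * yc / zc + cy))
    return pixels


def reprojection_error(parts_3d, parts_2d, visible, yaw, translation,
                       focal, principal_point):
    """Mean pixel distance between projected and predicted 2D parts.

    Only parts flagged in `visible` contribute (a hypothetical choice).
    Returns infinity when no part is selected.
    """
    projected = project_parts(parts_3d, yaw, translation, focal, principal_point)
    dists = [math.dist(p, q)
             for p, q, v in zip(projected, parts_2d, visible) if v]
    return sum(dists) / len(dists) if dists else float("inf")
```

A pose search would evaluate `reprojection_error` for candidate values of `yaw` and `translation` and keep the lowest one. A robust estimator would additionally down-weight or reject outlier parts, which this sketch does not attempt.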

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.07570/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1703.07570/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/1703.07570/full.md

---
Source: https://tomesphere.com/paper/1703.07570