# Improving 3D Object Detection for Pedestrians with Virtual Multi-View Synthesis Orientation Estimation

**Authors:** Jason Ku, Alex D. Pon, Sean Walsh, and Steven L. Waslander

arXiv: 1907.06777 · 2019-07-17

## TL;DR

This paper introduces a Virtual Multi-View Synthesis module that enhances 3D pedestrian orientation estimation by generating novel viewpoints from densified point clouds, significantly improving performance on the KITTI benchmark.

## Contribution

The paper proposes a novel Virtual Multi-View Synthesis module that can be integrated into existing 3D detection methods to improve pedestrian orientation estimation.

## Key findings

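- The module greatly improves orientation estimation for the pedestrian class on the KITTI benchmark.
- Used with the open-source 3D detector AVOD-FPN, the approach outperforms all other published methods on the KITTI pedestrian Orientation, 3D, and Bird's Eye View benchmarks.
- The module is designed to be flexible: it can be adopted into existing 3D object detection methods rather than requiring a new detector.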

## Abstract

Accurately estimating the orientation of pedestrians is an important and challenging task for autonomous driving because this information is essential for tracking and predicting pedestrian behavior. This paper presents a flexible Virtual Multi-View Synthesis module that can be adopted into 3D object detection methods to improve orientation estimation. The module uses a multi-step process to acquire the fine-grained semantic information required for accurate orientation estimation. First, the scene's point cloud is densified using a structure preserving depth completion algorithm and each point is colorized using its corresponding RGB pixel. Next, virtual cameras are placed around each object in the densified point cloud to generate novel viewpoints, which preserve the object's appearance. We show that this module greatly improves the orientation estimation on the challenging pedestrian class on the KITTI benchmark. When used with the open-source 3D detector AVOD-FPN, we outperform all other published methods on the pedestrian Orientation, 3D, and Bird's Eye View benchmarks.
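The virtual-camera step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a KITTI-style camera frame (x right, y down, z forward), a simple pinhole projection, and cameras spread evenly on a horizontal arc around the object centroid. The function names and every parameter (`radius`, `num_views`, `max_angle_deg`, `focal`, `size`) are hypothetical choices for illustration and are not taken from the paper. Depth completion and colorization are assumed to have already produced a list of `(xyz, rgb)` points.

```python
import math


def virtual_camera_poses(centroid, radius, num_views, max_angle_deg):
    """Place cameras on a horizontal arc around `centroid`, each facing it.

    Returns a list of (position, angle) tuples; `angle` is the rotation about
    the vertical axis relative to the original camera's viewing direction.
    """
    cx, cy, cz = centroid
    poses = []
    for i in range(num_views):
        # Spread the angles evenly over [-max_angle_deg, +max_angle_deg].
        frac = 0.5 if num_views == 1 else i / (num_views - 1)
        angle = math.radians(-max_angle_deg + 2.0 * max_angle_deg * frac)
        # Step back from the centroid along the camera's forward direction.
        position = (cx - radius * math.sin(angle), cy, cz - radius * math.cos(angle))
        poses.append((position, angle))
    return poses


def project_point(point, pose, focal, cu, cv):
    """Pinhole-project a 3D point into a virtual camera; returns (u, v, depth)."""
    (px, py, pz), angle = pose
    dx, dy, dz = point[0] - px, point[1] - py, point[2] - pz
    s, c = math.sin(angle), math.cos(angle)
    x_cam = dx * c - dz * s  # component along the camera's right axis
    y_cam = dy               # component along the camera's down axis
    z_cam = dx * s + dz * c  # component along the camera's forward axis
    if z_cam <= 0.0:
        return None          # point is behind the camera
    return (focal * x_cam / z_cam + cu, focal * y_cam / z_cam + cv, z_cam)


def render_view(colored_points, pose, size, focal):
    """Render a size x size RGB image, keeping the nearest point per pixel."""
    image = [[(0, 0, 0)] * size for _ in range(size)]
    depth = [[math.inf] * size for _ in range(size)]
    for point, rgb in colored_points:
        hit = project_point(point, pose, focal, size / 2.0, size / 2.0)
        if hit is None:
            continue
        col, row = int(round(hit[0])), int(round(hit[1]))
        if 0 <= row < size and 0 <= col < size and hit[2] < depth[row][col]:
            depth[row][col] = hit[2]
            image[row][col] = rgb
    return image


# Usage: render one small view per virtual camera for a toy "object".
toy_points = [((1.0, 0.0, 10.0), (255, 0, 0)), ((1.2, -0.5, 10.1), (0, 255, 0))]
views = [
    render_view(toy_points, pose, size=32, focal=100.0)
    for pose in virtual_camera_poses((1.0, 0.0, 10.0), 4.0, 5, 25.0)
]
```

Each rendered view could then be passed to an orientation estimator. How the views are combined into a single orientation estimate is described in the complete paper and is not covered by this sketch.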

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06777/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1907.06777/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1907.06777/full.md

---
Source: https://tomesphere.com/paper/1907.06777