# Fast Point R-CNN

**Authors:** Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia

arXiv: 1908.02990 · 2019-08-19

## TL;DR

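Fast Point R-CNN is a two-stage framework for 3D object detection from point clouds. A lightweight first stage works on a voxel representation to produce initial predictions, and a second stage refines them using raw points whose coordinates are fused with convolutional features via attention. The method reports state-of-the-art accuracy on KITTI at 15 FPS.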

## Contribution

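The paper proposes a two-stage detection framework. The first stage takes a voxel representation as input, uses only light convolutional operations, and outputs a small number of high-quality initial predictions. The second stage refines each prediction using its interior points, whose coordinates are fused with indexed convolutional features through an attention mechanism.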
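To make the voxel representation concrete, here is a minimal NumPy sketch of turning a point cloud into a voxel occupancy grid. It is an illustration under stated assumptions, not the authors' code: the function name `voxelize`, the detection range, and the voxel size are all hypothetical, and the paper's actual voxel features may differ.

```python
import numpy as np

def voxelize(points, voxel_size, range_min, range_max):
    """Map points (N, 3) to integer voxel indices and a boolean occupancy grid.

    All parameters are illustrative assumptions, not values from the paper.
    """
    points = np.asarray(points, dtype=np.float64)
    voxel_size = np.asarray(voxel_size, dtype=np.float64)
    range_min = np.asarray(range_min, dtype=np.float64)
    range_max = np.asarray(range_max, dtype=np.float64)

    # Keep only the points that fall inside the detection range.
    in_range = np.all((points >= range_min) & (points < range_max), axis=1)
    kept = points[in_range]

    # Number of voxels along each axis.
    grid_shape = np.ceil((range_max - range_min) / voxel_size).astype(np.int64)

    # Integer voxel coordinates of every kept point.
    idx = np.floor((kept - range_min) / voxel_size).astype(np.int64)
    idx = np.minimum(idx, grid_shape - 1)  # guard against floating-point edge cases

    # Mark each voxel that contains at least one point.
    occupancy = np.zeros(tuple(grid_shape), dtype=bool)
    occupancy[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return kept, idx, occupancy
```

The returned `idx` array is what lets a later stage look up, for any raw point, the feature computed for the voxel that contains it.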

## Key findings

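- Achieves state-of-the-art results on the KITTI dataset for both 3D and Bird's Eye View (BEV) detection.
- Runs at a detection rate of 15 frames per second.
- Fuses each point's coordinates with its indexed convolutional feature via an attention mechanism, preserving both accurate localization and context information.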

## Abstract

We present a unified, efficient and effective framework for point-cloud based 3D object detection. Our two-stage approach utilizes both voxel representation and raw point cloud data to exploit respective advantages. The first stage network, with voxel representation as input, only consists of light convolutional operations, producing a small number of high-quality initial predictions. Coordinate and indexed convolutional feature of each point in initial prediction are effectively fused with the attention mechanism, preserving both accurate localization and context information. The second stage works on interior points with their fused feature for further refining the prediction. Our method is evaluated on KITTI dataset, in terms of both 3D and Bird's Eye View (BEV) detection, and achieves state-of-the-arts with a 15FPS detection rate.
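The fusion step described in the abstract can be sketched as follows. This is a generic gated-attention illustration, not the paper's refinement network: the function `fuse_with_attention`, the weight matrices `w_coord` and `w_gate`, and the specific gating formula are assumptions made for the example. Only the overall idea (look up each point's convolutional feature by its voxel index, then combine it with a coordinate feature through attention) comes from the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_with_attention(coords, voxel_idx, feature_map, w_coord, w_gate):
    """Fuse per-point coordinates with indexed convolutional features.

    coords:      (N, 3) point coordinates
    voxel_idx:   (N, 3) integer voxel index of each point
    feature_map: (X, Y, Z, C) convolutional feature volume
    w_coord:     (3, C) hypothetical projection of coordinates
    w_gate:      (2C, C) hypothetical attention weights
    """
    coords = np.asarray(coords, dtype=np.float64)
    voxel_idx = np.asarray(voxel_idx, dtype=np.int64)

    # Indexed convolutional feature: the feature of the voxel containing each point.
    conv_feat = feature_map[voxel_idx[:, 0], voxel_idx[:, 1], voxel_idx[:, 2]]  # (N, C)

    # Lift raw coordinates to the same width with a linear layer and ReLU.
    coord_feat = np.maximum(coords @ w_coord, 0.0)  # (N, C)

    # Attention gate in (0, 1), computed from both feature sources.
    gate = sigmoid(np.concatenate([coord_feat, conv_feat], axis=1) @ w_gate)  # (N, C)

    # The gate decides, per channel, how much context feature to mix in.
    return coord_feat + gate * conv_feat

# Usage with toy shapes: 5 points, a 4x4x2 feature volume with 8 channels.
rng = np.random.default_rng(0)
fused = fuse_with_attention(
    coords=rng.normal(size=(5, 3)),
    voxel_idx=rng.integers(0, 2, size=(5, 3)),
    feature_map=rng.normal(size=(4, 4, 2, 8)),
    w_coord=rng.normal(size=(3, 8)),
    w_gate=rng.normal(size=(16, 8)),
)
```

In this sketch the coordinate branch carries the localization signal and the gated convolutional branch carries context, which mirrors the motivation stated in the abstract. In a trained model the weights would be learned rather than sampled at random.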

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.02990/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1908.02990/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/1908.02990/full.md

---
Source: https://tomesphere.com/paper/1908.02990