# R$^2$-CNN: Fast Tiny Object Detection in Large-Scale Remote Sensing Images

**Authors:** Jiangmiao Pang, Cong Li, Jianping Shi, Zhihai Xu, Huajun Feng

arXiv: 1902.06042 · 2019-04-02

## TL;DR

This paper introduces R$^2$-CNN, a fast and efficient neural network designed specifically for tiny object detection in large-scale remote sensing images, addressing speed and false alarm challenges.

## Contribution

The paper proposes a unified, self-reinforced network that combines a lightweight residual backbone (Tiny-Net), a global attention block, and a classifier and detector trained end to end, enabling fast tiny object detection in extremely large remote sensing images.

## Key findings

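- Processed an 18 000 × 18 192 pixel GF-1 image in 29.4 seconds on a Titan X using a single thread.
- Validated on hundreds of GF-1 images (18 000 × 18 192 pixels, 2.0-m resolution) and GF-2 images (27 620 × 29 200 pixels, 0.8-m resolution).
- To the authors' knowledge, no previous solution detects tiny objects on remote sensing images of this size.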
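To put the reported timing in perspective, the short calculation below converts the GF-1 image size and the 29.4 s processing time into an average throughput. It uses only figures quoted in this summary; the function name `megapixels_per_second` is illustrative and not taken from the paper.

```python
# Figures quoted in the summary: one GF-1 image and its reported processing time.
GF1_WIDTH = 18_000       # pixels
GF1_HEIGHT = 18_192      # pixels
GF1_SECONDS = 29.4       # reported wall-clock time for one GF-1 image


def megapixels_per_second(width: int, height: int, seconds: float) -> float:
    """Average throughput in megapixels per second (illustrative helper)."""
    return width * height / seconds / 1e6


total_pixels = GF1_WIDTH * GF1_HEIGHT
throughput = megapixels_per_second(GF1_WIDTH, GF1_HEIGHT, GF1_SECONDS)

print(f"GF-1 image: {total_pixels:,} pixels")
print(f"Average throughput: {throughput:.2f} Mpx/s")
```

Under these numbers the image holds roughly 327 million pixels, so the reported time corresponds to an average of about 11 megapixels per second.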

## Abstract

Recently, the convolutional neural network has brought impressive improvements for object detection. However, detecting tiny objects in large-scale remote sensing images still remains challenging. First, the extremely large input size makes the existing object detection solutions too slow for practical use. Second, the massive and complex backgrounds cause serious false alarms. Moreover, the ultratiny objects increase the difficulty of accurate detection. To tackle these problems, we propose a unified and self-reinforced network called remote sensing region-based convolutional neural network ($\mathcal{R}^2$-CNN), composed of the backbone Tiny-Net, an intermediate global attention block, and a final classifier and detector. Tiny-Net is a lightweight residual structure, which enables fast and powerful feature extraction from inputs. The global attention block is built upon Tiny-Net to inhibit false positives. The classifier is then used to predict the existence of targets in each patch, and the detector follows to locate them accurately if any are present. The classifier and detector are mutually reinforced with end-to-end training, which further speeds up the process and avoids false alarms. The effectiveness of $\mathcal{R}^2$-CNN is validated on hundreds of GF-1 images and GF-2 images that are 18 000 $\times$ 18 192 pixels at 2.0-m resolution and 27 620 $\times$ 29 200 pixels at 0.8-m resolution, respectively. Specifically, we can process a GF-1 image in 29.4 s on a Titan X with just a single thread. To our knowledge, no previous solution can detect tiny objects on such huge remote sensing images gracefully. We believe that it is a significant step toward practical real-time remote sensing systems.
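The abstract describes a per-patch flow in which a classifier first decides whether a patch contains targets and the detector runs only when it does. The sketch below illustrates that control flow in plain Python, under stated assumptions: the patch size, the stride, and the callables `has_target` and `locate` are hypothetical stand-ins, and the two stages are shown as separate functions even though the paper describes them as heads of one unified network. It is not the authors' implementation.

```python
from typing import Callable, Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def patch_origins(length: int, patch: int, stride: int) -> List[int]:
    """Start offsets of patches that cover [0, length) along one axis."""
    if length <= patch:
        return [0]
    origins = list(range(0, length - patch + 1, stride))
    if origins[-1] + patch < length:
        origins.append(length - patch)  # extra patch flush with the border
    return origins


def iter_patches(width: int, height: int, patch: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left corner of every patch in row-major order."""
    for y in patch_origins(height, patch, stride):
        for x in patch_origins(width, patch, stride):
            yield x, y


def detect_large_image(
    width: int,
    height: int,
    patch: int,                              # hypothetical patch size
    stride: int,                             # hypothetical stride
    has_target: Callable[[int, int], bool],  # stand-in for the classifier
    locate: Callable[[int, int], List[Box]], # stand-in for the detector
) -> List[Box]:
    """Run the detector only on patches the classifier flags as non-empty."""
    detections: List[Box] = []
    for x, y in iter_patches(width, height, patch, stride):
        if not has_target(x, y):
            continue  # background patch: skip the detector entirely
        for bx, by, bw, bh in locate(x, y):
            # Shift patch-local box coordinates into full-image coordinates.
            detections.append((bx + x, by + y, bw, bh))
    return detections


# Toy usage: an 8 x 4 image split into two 4 x 4 patches, only the second has a target.
boxes = detect_large_image(
    8, 4, patch=4, stride=4,
    has_target=lambda x, y: x == 4,
    locate=lambda x, y: [(1, 1, 2, 2)],
)
print(boxes)
```

Skipping the detector on background patches is what makes this gating attractive for large scenes where most patches contain no targets; overlapping patches would additionally need duplicate suppression, which this sketch omits.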

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.06042/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1902.06042/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1902.06042/full.md

---
Source: https://tomesphere.com/paper/1902.06042