Light-Head R-CNN: In Defense of Two-Stage Object Detector
Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

TL;DR
This paper introduces Light-Head R-CNN, a two-stage object detector with a lightweight head that achieves high accuracy and speed, outperforming both traditional two-stage and single-stage detectors on COCO.
Contribution
The paper proposes a novel lightweight head design for two-stage detectors, significantly improving speed without sacrificing accuracy.
Findings
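With a ResNet-101 backbone, Light-Head R-CNN outperforms state-of-the-art object detectors on COCO.
With a tiny, Xception-like backbone, it achieves 30.7 mmAP at 102 FPS on COCO.
In that configuration it outperforms single-stage detectors such as YOLO and SSD in both speed and accuracy.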
Abstract
In this paper, we first investigate why typical two-stage methods are not as fast as single-stage, fast detectors like YOLO and SSD. We find that Faster R-CNN and R-FCN perform intensive computation after or before RoI warping. Faster R-CNN involves two fully connected layers for RoI recognition, while R-FCN produces large score maps. Thus, these networks are slow because of the heavy-head design in the architecture. Even if we significantly reduce the base model, the computation cost cannot be largely decreased accordingly. We propose a new two-stage detector, Light-Head R-CNN, to address this shortcoming in current two-stage approaches. In our design, we make the head of the network as light as possible by using a thin feature map and a cheap R-CNN subnet (pooling and a single fully connected layer). Our ResNet-101 based Light-Head R-CNN outperforms state-of-the-art object…
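To make the "heavy head" versus "light head" contrast in the abstract concrete, here is a rough back-of-the-envelope sketch in Python. It is not the authors' code. Every constant in it is an assumption chosen for illustration (81 classes for COCO including background, a 7 x 7 pooling grid, 10 channels per bin for the thin feature map, a 2048-wide fully connected layer, and a VGG-16-style pair of 4096-wide fully connected layers for the heavy case), and the helper names are hypothetical.

```python
# Illustrative size comparison of detection heads. All constants below are
# assumptions for this sketch; they are not taken from the summary above.

def score_map_channels(per_bin_channels: int, pooled_size: int) -> int:
    """Channels in a map consumed by position-sensitive RoI pooling:
    one group of `per_bin_channels` for each of pooled_size x pooled_size bins."""
    return per_bin_channels * pooled_size * pooled_size


def fc_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


NUM_CLASSES = 81      # assumption: 80 COCO categories + background
POOLED_SIZE = 7       # assumption: 7 x 7 RoI pooling grid
THIN_CHANNELS = 10    # assumption: channels per bin in the thin feature map
LIGHT_FC_WIDTH = 2048 # assumption: width of the single FC layer
HEAVY_FC_WIDTH = 4096 # assumption: VGG-16-style FC width
HEAVY_ROI_CHANNELS = 512  # assumption: VGG-16-style RoI feature depth

# R-FCN-style score map: one channel group per class.
heavy_map = score_map_channels(NUM_CLASSES, POOLED_SIZE)
# Thin feature map: a small, class-independent channel group per bin.
thin_map = score_map_channels(THIN_CHANNELS, POOLED_SIZE)

# Light head: pooled RoI feature (THIN_CHANNELS x 7 x 7) into one FC layer.
light_roi_features = THIN_CHANNELS * POOLED_SIZE * POOLED_SIZE
light_fc = fc_params(light_roi_features, LIGHT_FC_WIDTH)

# Heavy head: pooled RoI feature (512 x 7 x 7) into two stacked FC layers.
heavy_roi_features = HEAVY_ROI_CHANNELS * POOLED_SIZE * POOLED_SIZE
heavy_fc = (fc_params(heavy_roi_features, HEAVY_FC_WIDTH)
            + fc_params(HEAVY_FC_WIDTH, HEAVY_FC_WIDTH))

if __name__ == "__main__":
    print(f"score-map channels: heavy={heavy_map}, thin={thin_map}")
    print(f"per-RoI FC parameters: heavy={heavy_fc:,}, light={light_fc:,}")
```

Under these assumed numbers the thin map has 490 channels against 3969 for the class-dependent score map, and the single fully connected layer has roughly a million parameters. The per-RoI cost matters because the head runs once for every proposal, so it does not shrink when the backbone does.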
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Adversarial Robustness in Machine Learning
Methods: Batch Normalization · Average Pooling · Residual Connection · Global Average Pooling · Bottleneck Residual Block · Residual Block · Kaiming Initialization
