R-FCN-3000 at 30fps: Decoupling Detection and Classification

Bharat Singh; Hengduo Li; Abhishek Sharma; Larry S. Davis

arXiv:1712.01802 · cs.CV · December 6, 2017

TL;DR

R-FCN-3000 is a real-time, large-scale object detector that decouples objectness detection from fine-grained classification. It reaches 34.9% mAP on the ImageNet detection dataset at 30 images per second, and its learned objectness generalizes to novel classes.

Contribution

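The paper introduces an R-FCN-based architecture that decouples objectness detection from fine-grained classification. Position-sensitive filters are shared across object classes for localization and are not used for classification, which enables large-scale object detection at 30 fps.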
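To see why sharing position-sensitive filters matters at this scale, the following is a rough back-of-the-envelope sketch rather than the authors' accounting. It relies on the original R-FCN design, in which the classification branch produces k × k position-sensitive score maps per class plus background. The function name, the grid size k = 7, the count of 3000 classes (taken from the model name), and the single shared objectness output are all assumptions made for illustration.

```python
def position_sensitive_channels(num_outputs, k=7):
    """Channels in a k x k position-sensitive score-map layer.

    num_outputs is the number of foreground outputs; one extra
    output is added for background, as in the original R-FCN.
    """
    return k * k * (num_outputs + 1)


# Assumption: one set of score maps per class, for 3000 classes.
per_class = position_sensitive_channels(3000)

# Assumption: a single objectness output shared by all classes.
shared = position_sensitive_channels(1)

print(per_class, shared)
```

Under these assumptions the per-class layout needs 147,049 score-map channels while the shared layout needs 98, which illustrates how sharing filters across classes keeps the localization cost independent of the number of classes.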

Findings

01

Achieves 34.9% mAP on the ImageNet detection dataset.

02

Outperforms YOLO-9000 by 18% in detection accuracy.

03

The learned objectness generalizes to novel classes, and performance increases with the number of training object classes.

Abstract

We present R-FCN-3000, a large-scale real-time object detector in which objectness detection and classification are decoupled. To obtain the detection score for an RoI, we multiply the objectness score with the fine-grained classification score. Our approach is a modification of the R-FCN architecture in which position-sensitive filters are shared across different object classes for performing localization. For fine-grained classification, these position-sensitive filters are not needed. R-FCN-3000 obtains an mAP of 34.9% on the ImageNet detection dataset and outperforms YOLO-9000 by 18% while processing 30 images per second. We also show that the objectness learned by R-FCN-3000 generalizes to novel classes and the performance increases with the number of training object classes, supporting the hypothesis that it is possible to learn a universal objectness detector. Code will be made…
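The score combination described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. The function names, the example numbers, and the use of a softmax to turn fine-grained logits into classification scores are assumptions. Only the final multiplication of objectness by classification score comes from the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detection_scores(objectness, class_logits):
    """Per-class detection scores for one RoI.

    Assumption: class_logits are turned into fine-grained
    classification scores with a softmax. Each score is then
    multiplied by the RoI's objectness score.
    """
    class_probs = softmax(class_logits)
    return [objectness * p for p in class_probs]


# Hypothetical RoI with high objectness and three fine-grained classes.
scores = detection_scores(objectness=0.9, class_logits=[2.0, 0.5, -1.0])
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores)
```

Because the classification scores sum to one in this sketch, the objectness score caps every class's detection score. An RoI with low objectness therefore cannot produce a confident detection for any class, which is the practical effect of decoupling the two stages.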

Taxonomy

Methods: Position-Sensitive RoI Pooling · Convolution · Region-based Fully Convolutional Network