TL;DR
R-FCN-3000 is a real-time, large-scale object detector that decouples objectness detection from classification. It reaches 34.9% mAP on the ImageNet detection dataset at 30 images per second, and its learned objectness generalizes to new classes.
Contribution
It introduces a novel decoupled detection and classification architecture based on R-FCN, enabling efficient and accurate large-scale object detection at 30fps.
Findings
Achieves 34.9% mAP on the ImageNet detection dataset.
Outperforms YOLO-9000 by 18% while processing 30 images per second.
The learned objectness generalizes to novel classes, and performance increases with the number of training object classes.
Abstract
We present R-FCN-3000, a large-scale real-time object detector in which objectness detection and classification are decoupled. To obtain the detection score for an RoI, we multiply the objectness score with the fine-grained classification score. Our approach is a modification of the R-FCN architecture in which position-sensitive filters are shared across different object classes for performing localization. For fine-grained classification, these position-sensitive filters are not needed. R-FCN-3000 obtains an mAP of 34.9% on the ImageNet detection dataset and outperforms YOLO-9000 by 18% while processing 30 images per second. We also show that the objectness learned by R-FCN-3000 generalizes to novel classes and the performance increases with the number of training object classes - supporting the hypothesis that it is possible to learn a universal objectness detector. Code will be made…
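The score combination described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `softmax` and `detection_scores`, the use of a softmax over class logits, and the toy input values are all assumptions made for the example. It shows only the stated idea that the detection score for an RoI is the objectness score multiplied by the fine-grained classification score.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detection_scores(objectness, class_logits):
    """Combine per-RoI objectness with fine-grained class probabilities.

    objectness:   list of floats in [0, 1], one per RoI (hypothetical input).
    class_logits: list of lists, one row of class logits per RoI
                  (assumed to come from a separate classification head).
    Returns one row of per-class detection scores per RoI.
    """
    scores = []
    for p_obj, logits in zip(objectness, class_logits):
        class_probs = softmax(logits)
        # Detection score = objectness score * fine-grained class score.
        scores.append([p_obj * p for p in class_probs])
    return scores


# Toy example: two RoIs, three fine-grained classes (made-up values).
objectness = [0.9, 0.1]
class_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
scores = detection_scores(objectness, class_logits)
```

Because the class probabilities for each RoI sum to one, the scores in each row sum to that RoI's objectness, so an RoI with low objectness cannot receive a high score for any class.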
Taxonomy
Methods: Position-Sensitive RoI Pooling · Convolution · Region-based Fully Convolutional Network
