RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free
Cheng-Yang Fu, Mykhailo Shvets, Alexander C. Berg

TL;DR
RetinaMask improves single-shot object detection by adding mask prediction, an adaptive loss, and extra hard examples during training, reaching accuracy comparable to two-stage detectors with no additional computational cost for detection at evaluation time.
Contribution
This paper introduces RetinaMask, a novel single-shot detector that incorporates mask prediction and training improvements to match two-stage detector accuracy.
Findings
RetinaMask-101 achieves 41.4 mAP on COCO test-dev, compared with 39.1 mAP for RetinaNet-101.
Adding Group Normalization increases performance to 41.7 mAP.
Evaluation runtime is the same as RetinaNet's.
Abstract
Recently, two-stage detectors have surged ahead of single-shot detectors in the accuracy-vs-speed trade-off. Nevertheless, single-shot detectors are immensely popular in embedded vision applications. This paper brings single-shot detectors up to the same level as current two-stage techniques. We do this by improving training for the state-of-the-art single-shot detector, RetinaNet, in three ways: integrating instance mask prediction for the first time, making the loss function adaptive and more stable, and including additional hard examples in training. We call the resulting augmented network RetinaMask. The detection component of RetinaMask has the same computational cost as the original RetinaNet, but is more accurate. COCO test-dev results are up to 41.4 mAP for RetinaMask-101 vs 39.1 mAP for RetinaNet-101, while the runtime is the same during evaluation. Adding Group Normalization…
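The abstract mentions an adaptive loss without giving its form. The sketch below shows one way such a loss could work: a Smooth L1 box-regression loss whose control point `beta` (the switch from quadratic to linear behaviour) follows running statistics of the absolute regression error. The class name, the update rule, and the defaults (`beta_max=0.11`, `momentum=0.9`, the initial running mean) are assumptions made for illustration and are not taken from the text on this page.

```python
def smooth_l1(x, beta):
    """Smooth L1: quadratic for |x| < beta, linear otherwise."""
    a = abs(x)
    if a < beta:  # never true when beta == 0, so no division by zero
        return 0.5 * a * a / beta
    return a - 0.5 * beta


class SelfAdjustingSmoothL1:
    """Hypothetical adaptive Smooth L1 loss (illustrative sketch only)."""

    def __init__(self, beta_max=0.11, momentum=0.9):
        self.beta_max = beta_max        # assumed upper bound for beta
        self.momentum = momentum        # assumed running-average momentum
        self.running_mean = beta_max    # assumed initial value
        self.running_var = 0.0
        self.beta = beta_max

    def update(self, errors):
        """Update running mean/variance of |error| and recompute beta."""
        abs_errors = [abs(e) for e in errors]
        n = len(abs_errors)
        mean = sum(abs_errors) / n
        var = sum((a - mean) ** 2 for a in abs_errors) / n
        m = self.momentum
        self.running_mean = m * self.running_mean + (1 - m) * mean
        self.running_var = m * self.running_var + (1 - m) * var
        # Clip beta into [0, beta_max] (assumed rule).
        self.beta = max(0.0, min(self.beta_max,
                                 self.running_mean - self.running_var))
        return self.beta

    def __call__(self, errors):
        """Mean loss over a batch of regression errors."""
        beta = self.update(errors)
        return sum(smooth_l1(e, beta) for e in errors) / len(errors)


if __name__ == "__main__":
    loss_fn = SelfAdjustingSmoothL1()
    batch = [0.5, -0.2, 0.05, 1.3]  # made-up box regression errors
    print("loss:", loss_fn(batch), "beta:", loss_fn.beta)
```

Under these assumptions, `beta` shrinks as the regression errors become small and consistent, so the loss behaves more like L1 late in training, while the clip to `beta_max` keeps the quadratic region from growing when errors are large.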
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Focal Loss · Average Pooling · RetinaNet · ResNeXt Block · RoIAlign · Non Maximum Suppression · Step Decay · SGD with Momentum · Weight Decay · Self-Adjusting Smooth L1 Loss
