RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free

Cheng-Yang Fu; Mykhailo Shvets; Alexander C. Berg

arXiv:1901.03353·cs.CV·January 14, 2019·119 cites

TL;DR

RetinaMask improves single-shot object detection by adding instance mask prediction during training, an adaptive loss function, and additional hard examples, reaching accuracy comparable to two-stage detectors at the same evaluation-time cost as RetinaNet.

Contribution

This paper introduces RetinaMask, a single-shot detector that augments RetinaNet with instance mask prediction and training improvements to match the accuracy of two-stage detectors.

Findings

01. RetinaMask-101 achieves 41.4 mAP on COCO test-dev, versus 39.1 mAP for RetinaNet-101.

02. Adding Group Normalization increases performance to 41.7 mAP.

03. Evaluation runtime is the same as RetinaNet's.

Abstract

Recently, two-stage detectors have surged ahead of single-shot detectors in the accuracy-vs-speed trade-off. Nevertheless, single-shot detectors are immensely popular in embedded vision applications. This paper brings single-shot detectors up to the same level as current two-stage techniques. We do this by improving training for the state-of-the-art single-shot detector, RetinaNet, in three ways: integrating instance mask prediction for the first time, making the loss function adaptive and more stable, and including additional hard examples in training. We call the resulting augmented network RetinaMask. The detection component of RetinaMask has the same computational cost as the original RetinaNet, but is more accurate. COCO test-dev results are up to 41.4 mAP for RetinaMask-101 vs 39.1 mAP for RetinaNet-101, while the runtime is the same during evaluation. Adding Group Normalization…
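
The abstract mentions making the loss function adaptive, and the Methods list names a Self-Adjusting Smooth L1 Loss. The sketch below shows one way such a loss could work: a smooth L1 loss whose control point `beta` is derived from running statistics of the absolute regression error and clipped to a fixed range. The class name `AdaptiveSmoothL1`, the defaults `beta_max=0.11` and `momentum=0.9`, the zero initialization, and the exact update rule are assumptions for illustration; this page does not specify them, so treat this as a sketch rather than the authors' implementation.

```python
from statistics import fmean, pvariance


def smooth_l1(x: float, beta: float) -> float:
    """Smooth L1: quadratic for |x| < beta, linear otherwise."""
    ax = abs(x)
    if beta > 0 and ax < beta:
        return 0.5 * ax * ax / beta
    return ax - 0.5 * beta


class AdaptiveSmoothL1:
    """Hypothetical helper: smooth L1 with a control point that tracks
    running statistics of the absolute error (assumed update rule)."""

    def __init__(self, beta_max: float = 0.11, momentum: float = 0.9):
        self.beta_max = beta_max    # assumed upper bound on the control point
        self.momentum = momentum    # assumed smoothing factor for running stats
        self.running_mean = 0.0     # assumed initialization
        self.running_var = 0.0      # assumed initialization

    def __call__(self, errors: list[float]) -> float:
        abs_err = [abs(e) for e in errors]
        m = self.momentum
        # Exponential moving averages of the batch mean and variance.
        self.running_mean = m * self.running_mean + (1 - m) * fmean(abs_err)
        self.running_var = m * self.running_var + (1 - m) * pvariance(abs_err)
        # Clip the control point into [0, beta_max].
        beta = max(0.0, min(self.beta_max, self.running_mean - self.running_var))
        return fmean([smooth_l1(e, beta) for e in errors])


if __name__ == "__main__":
    loss_fn = AdaptiveSmoothL1()
    print(loss_fn([0.2, -0.4, 0.1]))  # mean loss over a toy batch of box errors
```

With a control point of zero the loss reduces to plain L1, so clipping into `[0, beta_max]` keeps the loss between L1 and a standard smooth L1 with a fixed `beta_max`.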


Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Focal Loss · Average Pooling · RetinaNet · ResNeXt Block · RoIAlign · Non Maximum Suppression · Step Decay · SGD with Momentum · Weight Decay · Self-Adjusting Smooth L1 Loss
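
Focal Loss, listed first among the methods, is the classification loss that RetinaNet is built on. As a reminder of its standard form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), here is a minimal single-prediction sketch. The function name `focal_loss` is illustrative, and the defaults `alpha=0.25` and `gamma=2.0` are the commonly cited settings for focal loss, not values taken from this page.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for one prediction, where p = P(y = 1)."""
    eps = 1e-12
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    # The (1 - p_t)^gamma factor down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(focal_loss(0.9, 1))  # easy positive: small loss
    print(focal_loss(0.1, 1))  # hard positive: much larger loss
```

Setting `gamma=0` recovers alpha-weighted cross-entropy, which makes the modulating factor easy to isolate when comparing the two.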