MegDet: A Large Mini-Batch Object Detector
Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun

TL;DR
This paper introduces MegDet, a large mini-batch object detector that leverages increased batch sizes and cross-GPU batch normalization to significantly reduce training time and improve accuracy in object detection tasks.
Contribution
The paper proposes training object detectors with much larger mini-batches (e.g., 256 instead of 16), made practical by a learning rate policy and Cross-GPU Batch Normalization, which together shorten training and improve accuracy.
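As an illustration of what a learning rate policy for large mini-batches can look like, here is a minimal sketch. It assumes linear scaling of the learning rate with batch size plus a linear warmup, a common choice that this summary does not itself confirm. The function names and the numeric values (base rate 0.02, 500 warmup steps) are hypothetical and chosen only for the example.

```python
def scaled_lr(base_lr, base_batch, batch):
    """Assumed linear scaling: grow the learning rate with the batch size."""
    return base_lr * batch / base_batch


def lr_at(step, target_lr, warmup_steps, warmup_start_lr):
    """Assumed linear warmup from warmup_start_lr to target_lr, then constant."""
    if step < warmup_steps:
        frac = step / warmup_steps
        return warmup_start_lr + frac * (target_lr - warmup_start_lr)
    return target_lr


# Illustrative values only: a batch of 16 at lr 0.02 scaled up to a batch of 256.
target = scaled_lr(base_lr=0.02, base_batch=16, batch=256)
schedule = [lr_at(s, target, warmup_steps=500, warmup_start_lr=0.02)
            for s in (0, 250, 500, 1000)]
```

The warmup starts from the small-batch rate and only reaches the scaled rate after `warmup_steps`, which is one way to avoid instability early in training with a large rate.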
Findings
Training time reduced from 33 hours to 4 hours by scaling to a larger mini-batch (e.g., from 16 to 256).
The COCO 2017 submission built on MegDet reached 52.5% mmAP.
Won 1st place in COCO 2017 Detection Challenge.
Abstract
The improvements in recent CNN-based object detection works, from R-CNN [11], Fast/Faster R-CNN [10, 31] to recent Mask R-CNN [14] and RetinaNet [24], mainly come from new network, new framework, or novel loss design. But mini-batch size, a key factor in the training, has not been well studied. In this paper, we propose a Large MiniBatch Object Detector (MegDet) to enable the training with much larger mini-batch size than before (e.g. from 16 to 256), so that we can effectively utilize multiple GPUs (up to 128 in our experiments) to significantly shorten the training time. Technically, we suggest a learning rate policy and Cross-GPU Batch Normalization, which together allow us to successfully train a large mini-batch detector in much less time (e.g., from 33 hours to 4 hours), and achieve even better accuracy. The MegDet is the backbone of our submission (mmAP 52.5%) to COCO 2017…
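The idea behind Cross-GPU Batch Normalization can be sketched as follows: each device contributes its local sum and sum of squares, these are aggregated across devices, and every device normalizes with the shared statistics. This is a simplified single-channel simulation in plain Python, not the authors' implementation. The devices are modeled as lists, the aggregation stands in for an all-reduce, the learnable scale and shift are omitted, and the function name is hypothetical.

```python
import math


def cross_device_batch_norm(shards, eps=1e-5):
    """Normalize per-device shards using statistics pooled over all devices.

    shards: list of lists of floats, one inner list per simulated device.
    """
    # Each device would compute these three quantities locally; summing them
    # over devices plays the role of the all-reduce step.
    count = sum(len(s) for s in shards)
    total = sum(sum(s) for s in shards)
    total_sq = sum(sum(x * x for x in s) for s in shards)

    mean = total / count
    var = total_sq / count - mean * mean  # E[x^2] - (E[x])^2
    scale = 1.0 / math.sqrt(var + eps)

    # Every device normalizes its own shard with the shared mean and variance.
    return [[(x - mean) * scale for x in s] for s in shards]


# Two simulated devices with two activations each.
normalized = cross_device_batch_norm([[1.0, 2.0], [3.0, 4.0]])
```

With per-device statistics, each shard here would be normalized around its own mean (1.5 and 3.5). Pooling the statistics instead normalizes all four values around the global mean of 2.5, which is what makes the effective normalization batch larger than a single device's share.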
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Region Proposal Network · Batch Normalization · 1x1 Convolution · Feature Pyramid Network · Focal Loss · RetinaNet · Softmax · Convolution · RoIAlign · Mask R-CNN
