SWA Object Detection
Haoyang Zhang, Ying Wang, Feras Dayoub, Niko Sünderhauf

TL;DR
This paper demonstrates that applying Stochastic Weights Averaging (SWA) with a simple 12-epoch cyclical learning rate schedule consistently improves object detection performance by about 1.0 AP across various models on the COCO benchmark, without additional inference costs.
Contribution
The study systematically applies SWA to object detection and segmentation, revealing a simple, effective policy that enhances accuracy without changing the detector architecture.
Findings
Improves AP by approximately 1.0 across multiple detectors on COCO.
The gain comes from 12 extra training epochs with cyclical learning rates, followed by averaging the 12 resulting checkpoints.
The averaged model keeps the original detector architecture, so inference cost is unchanged.
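The cyclical learning rate mentioned above can be sketched as follows. This is a minimal illustration, not the authors' schedule: it assumes one cycle per epoch with cosine decay from a high to a low value, and the names `cyclical_lr`, `steps_per_cycle`, `lr_max`, and `lr_min`, as well as the default values, are hypothetical.

```python
import math


def cyclical_lr(step, steps_per_cycle, lr_max=0.02, lr_min=0.0002):
    """Learning rate at a given training step under a cosine cyclical schedule.

    Assumption: each cycle restarts at lr_max and decays towards lr_min.
    The summary above only says "cyclical learning rates"; the cosine shape
    and the default values here are illustrative choices.
    """
    # Position inside the current cycle, in [0, 1).
    position = (step % steps_per_cycle) / steps_per_cycle
    # Cosine decay: lr_max at position 0, approaching lr_min as position -> 1.
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * position))


# Example: 12 extra epochs, one cycle per epoch, 100 iterations per epoch
# (the iteration count is made up for the example).
steps_per_epoch = 100
schedule = [cyclical_lr(s, steps_per_epoch) for s in range(12 * steps_per_epoch)]
```

Under this sketch, a checkpoint would be saved at the end of each cycle, when the learning rate is at its lowest, giving 12 checkpoints to average.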
Abstract
Do you want to improve 1.0 AP for your object detector without any inference cost and any change to your detector? Let us tell you such a recipe. It is surprisingly simple: train your detector for an extra 12 epochs using cyclical learning rates and then average these 12 checkpoints as your final detection model. This potent recipe is inspired by Stochastic Weights Averaging (SWA), which is proposed in arXiv:1803.05407 for improving generalization in deep neural networks. We found it also very effective in object detection. In this technique report, we systematically investigate the effects of applying SWA to object detection as well as instance segmentation. Through extensive experiments, we discover the aforementioned workable policy of performing SWA in object detection, and we consistently achieve 1.0 AP improvement over various popular detectors on the challenging COCO…
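The averaging step in the recipe amounts to an element-wise mean over the saved checkpoints. The sketch below shows this with plain Python containers; it assumes each checkpoint is a mapping from parameter name to a flat list of floats, which is a simplification of a real framework state dict. The function name `average_checkpoints` is hypothetical and not taken from the authors' code.

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of a list of checkpoints.

    Assumption: every checkpoint is a dict {parameter_name: [float, ...]}
    with identical keys and list lengths. Real checkpoints hold tensors,
    but the arithmetic is the same.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint to average")
    names = checkpoints[0].keys()
    if any(ckpt.keys() != names for ckpt in checkpoints):
        raise ValueError("all checkpoints must share the same parameter names")

    count = len(checkpoints)
    averaged = {}
    for name in names:
        # zip(*...) groups the i-th value of this parameter across checkpoints.
        per_position = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(values) / count for values in per_position]
    return averaged


# Toy example with 3 checkpoints (the recipe in the abstract uses 12).
toy_checkpoints = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]
swa_weights = average_checkpoints(toy_checkpoints)
# swa_weights == {"w": [3.0, 4.0], "b": [1.0]}
```

Because the result has the same parameter names and shapes as any single checkpoint, it loads into the unchanged detector, which is why the recipe adds no inference cost.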
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Average Pooling · Feature Pyramid Network · Varifocal Loss · Convolution · Non Maximum Suppression · Global Average Pooling · FCOS · Softmax · k-Means Clustering · 1x1 Convolution
