Consistent Optimization for Single-Shot Object Detection
Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Jianbo Shi

TL;DR
This paper introduces consistent optimization for single-stage object detection, which aligns the training hypotheses with inference to improve accuracy at almost no additional cost. On RetinaNet it raises COCO AP from 39.1 to 40.1.
Contribution
It proposes consistent optimization, a training strategy for single-stage detectors that matches the training hypotheses to inference by using refined anchors during training, which improves accuracy.
Findings
Improves RetinaNet from 39.1 AP to 40.1 AP on COCO.
Performance gains are stable across different model capacities.
Consistent optimization is nearly cost-free.
Abstract
We present consistent optimization for single-stage object detection. Previous single-stage object detectors usually rely on regular, densely sampled anchors to generate hypotheses for optimizing the model. By examining the detector's behavior, we observe that the misalignment between the optimization target and the inference configuration has hindered performance improvement. We propose to bridge this gap with consistent optimization, an extension of the traditional single-stage detector's optimization strategy. Consistent optimization focuses on matching the training hypotheses and the inference quality by utilizing the refined anchors during training. To evaluate its effectiveness, we explore various design choices based on the state-of-the-art RetinaNet detector. We demonstrate it is the consistent optimization, not the architecture…
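The misalignment described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the common (dx, dy, dw, dh) box-delta parameterization and IoU thresholds of 0.5 (positive) and 0.4 (negative), and the function names `decode`, `iou`, and `assign_label` are hypothetical. The point it shows is that an anchor labeled negative by its original position can become a positive once the label is computed from the refined anchor, which is the box actually scored at inference.

```python
import math

def decode(anchor, delta):
    """Apply (dx, dy, dw, dh) offsets to an (x1, y1, x2, y2) anchor.
    Assumption: the usual center/size delta parameterization."""
    x1, y1, x2, y2 = anchor
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = delta
    ncx, ncy = cx + dx * w, cy + dy * h
    nw, nh = w * math.exp(dw), h * math.exp(dh)
    return (ncx - 0.5 * nw, ncy - 0.5 * nh, ncx + 0.5 * nw, ncy + 0.5 * nh)

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def assign_label(box, gt_boxes, pos_thr=0.5, neg_thr=0.4):
    """Return 1 (positive), 0 (negative), or -1 (ignored) from the best IoU.
    Assumption: thresholds chosen for illustration."""
    best = max((iou(box, g) for g in gt_boxes), default=0.0)
    if best >= pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return -1

anchor = (0.0, 0.0, 10.0, 10.0)
gt_boxes = [(5.0, 0.0, 15.0, 10.0)]
delta = (0.4, 0.0, 0.0, 0.0)  # hypothetical regression output: shift right by 0.4 * width

# Label computed from the original anchor (conventional training target).
label_from_anchor = assign_label(anchor, gt_boxes)

# Label computed from the refined anchor (the box that inference actually scores).
refined = decode(anchor, delta)
label_from_refined = assign_label(refined, gt_boxes)

print(label_from_anchor, label_from_refined)
```

In this toy case the original anchor overlaps the ground truth with IoU 1/3 and is treated as background, while the refined anchor reaches IoU 9/11 and would be a positive. Training the classifier on labels derived from refined anchors is one way to read "matching the training hypotheses and the inference quality".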
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods
Methods: 1x1 Convolution · Convolution · Feature Pyramid Network · Focal Loss · RetinaNet
