Prediction-Guided Distillation for Dense Object Detection
Chenhongyi Yang, Mateusz Ochal, Amos Storkey, Elliot J. Crowley

TL;DR
This paper introduces Prediction-Guided Distillation (PGD), a knowledge distillation method that improves small, cheap object detection models by focusing knowledge transfer on the most informative regions within ground-truth bounding boxes, yielding AP gains of +3.1% to +4.6% on COCO.
Contribution
The paper proposes PGD, a distillation approach that concentrates knowledge transfer on the teacher's key predictive regions and applies adaptive weighting over those regions. It outperforms existing KD methods on multiple detection benchmarks.
Findings
Achieves a +3.1% to +4.6% AP improvement on the COCO dataset.
Outperforms state-of-the-art KD baselines across various detection architectures.
Effective on multiple datasets including CrowdHuman.
Abstract
Real-world object detection models should be cheap and accurate. Knowledge distillation (KD) can boost the accuracy of a small, cheap detection model by leveraging useful information from a larger teacher model. However, a key challenge is identifying the most informative features produced by the teacher for distillation. In this work, we show that only a very small fraction of features within a ground-truth bounding box are responsible for a teacher's high detection performance. Based on this, we propose Prediction-Guided Distillation (PGD), which focuses distillation on these key predictive regions of the teacher and yields considerable gains in performance over many existing KD baselines. In addition, we propose an adaptive weighting scheme over the key regions to smooth out their influence and achieve even better performance. Our proposed approach outperforms current…
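The idea of restricting distillation to a few key locations inside a ground-truth box, and weighting those locations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the per-location teacher "quality score" map, the top-k selection, and the score-proportional weights are hypothetical and are not taken from the paper, whose actual selection and weighting scheme may differ.

```python
# Illustrative sketch only: names, the score map, and the weighting rule
# are assumptions, not the authors' implementation.

def topk_weight_mask(scores, box, k):
    """Build a weight map that is nonzero only on the k highest-scoring
    cells inside `box`.

    scores: H x W nested list of hypothetical teacher quality scores.
    box: (x0, y0, x1, y1), inclusive, in feature-map cell coordinates.
    The returned weights sum to 1.
    """
    x0, y0, x1, y1 = box
    cells = [(scores[y][x], y, x)
             for y in range(y0, y1 + 1)
             for x in range(x0, x1 + 1)]
    top = sorted(cells, reverse=True)[:k]
    total = sum(s for s, _, _ in top)
    weights = [[0.0] * len(scores[0]) for _ in scores]
    for s, y, x in top:
        # Score-proportional weighting; fall back to uniform if all scores are 0.
        weights[y][x] = s / total if total > 0 else 1.0 / len(top)
    return weights


def weighted_feature_loss(teacher, student, weights):
    """Weighted squared error between single-channel feature maps."""
    return sum(w * (t - s) ** 2
               for t_row, s_row, w_row in zip(teacher, student, weights)
               for t, s, w in zip(t_row, s_row, w_row))


# Toy 3 x 3 example with one ground-truth box covering the whole map.
scores = [[0.1, 0.2, 0.1],
          [0.1, 0.9, 0.3],
          [0.0, 0.1, 0.1]]
teacher = [[1.0, 1.0, 1.0],
           [1.0, 2.0, 3.0],
           [1.0, 1.0, 1.0]]
student = [[0.0] * 3 for _ in range(3)]

weights = topk_weight_mask(scores, box=(0, 0, 2, 2), k=2)
loss = weighted_feature_loss(teacher, student, weights)
```

In this toy setup only the two highest-scoring cells contribute to the loss, so the student is pushed to match the teacher where the teacher's predictions are strongest rather than uniformly across the box.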
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Knowledge Distillation
