Task-Balanced Distillation for Object Detection
Ruining Tang, Zhenyu Liu, Yangguang Li, Yiguo Song, Hui Liu, Qide Wang, Jing Shao, Guifang Duan, Jianrong Tan

TL;DR
This paper introduces Task-Balanced Distillation (TBD), a knowledge distillation method for object detection that aligns the classification and regression tasks through a harmony score and task-decoupled feature distillation, improving student model performance.
Contribution
The paper proposes a novel harmony score and task-decoupled feature distillation to address spatial misalignment in object detection knowledge distillation.
Findings
RetinaNet with ResNet-50 achieves 41.0 mAP on COCO with TBD.
TBD outperforms recent distillation methods like FGD and FRS.
The method demonstrates strong generalization across models and datasets.
Abstract
Mainstream object detectors are commonly constituted of two sub-tasks, including classification and regression tasks, implemented by two parallel heads. This classic design paradigm inevitably leads to inconsistent spatial distributions between classification score and localization quality (IOU). Therefore, this paper alleviates this misalignment in the view of knowledge distillation. First, we observe that the massive teacher achieves a higher proportion of harmonious predictions than the lightweight student. Based on this intriguing observation, a novel Harmony Score (HS) is devised to estimate the alignment of classification and regression qualities. HS models the relationship between two sub-tasks and is seen as prior knowledge to promote harmonious predictions for the student. Second, this spatial misalignment will result in inharmonious region selection when distilling features.…
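The abstract describes the Harmony Score only in words and is truncated before any formula appears, so the paper's actual definition is not reproduced here. As a rough illustration of what "alignment of classification and regression qualities" could mean, the sketch below uses a hypothetical proxy, 1 - |classification score - IoU|, and counts the share of predictions whose proxy exceeds a threshold. The function names, the proxy formula, and the 0.8 threshold are all assumptions made for illustration, not taken from the paper.

```python
def harmony_proxy(cls_score, iou):
    """Hypothetical alignment measure in [0, 1]; NOT the paper's HS definition.

    Returns 1.0 when classification confidence equals localization quality
    (IoU) and decreases linearly as the two diverge.
    """
    if not (0.0 <= cls_score <= 1.0 and 0.0 <= iou <= 1.0):
        raise ValueError("cls_score and iou must both lie in [0, 1]")
    return 1.0 - abs(cls_score - iou)


def harmonious_fraction(cls_scores, ious, threshold=0.8):
    """Share of predictions whose proxy is at least `threshold`.

    The threshold is an arbitrary illustrative choice.
    """
    if len(cls_scores) != len(ious):
        raise ValueError("cls_scores and ious must have the same length")
    if not cls_scores:
        return 0.0
    hits = sum(
        1 for s, q in zip(cls_scores, ious) if harmony_proxy(s, q) >= threshold
    )
    return hits / len(cls_scores)


if __name__ == "__main__":
    # Four toy predictions: two well aligned, two misaligned.
    scores = [0.9, 0.9, 0.2, 0.8]
    ious = [0.85, 0.3, 0.25, 0.1]
    print(harmonious_fraction(scores, ious))
```

Under this proxy, a confident box with poor localization (score 0.9, IoU 0.3) counts as inharmonious, which is the kind of misalignment the abstract says is more frequent in the student than in the teacher.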
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
Methods: Focal Loss · 1x1 Convolution · Convolution · Feature Pyramid Network · RetinaNet
