Rethinking Classification and Localization for Object Detection
Yue Wu, Yinpeng Chen, Lu Yuan, Zicheng Liu, Lijuan Wang, Hongzhi Li, Yun Fu

TL;DR
This paper analyzes the roles of fully connected and convolutional heads in object detection, revealing their task preferences, and proposes a double-head approach that improves detection accuracy on MS COCO.
Contribution
It provides a detailed understanding of head structures in object detection and introduces a double-head method that enhances performance by leveraging their complementary strengths.
Findings
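The fully connected head (fc-head) is better suited to classification.
The convolution head (conv-head) is better suited to localization.
The proposed double-head method improves AP on MS COCO by 3.5 and 2.8 over FPN baselines with ResNet-50 and ResNet-101 backbones, respectively.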
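The split described in the findings above can be sketched as a toy, standard-library-only Python example. It is a minimal illustration under stated assumptions, not the authors' implementation: the class name `DoubleHeadSketch`, the layer sizes, the random weights, and the use of a single 1x1 convolution followed by average pooling in the conv branch are all hypothetical simplifications chosen for brevity.

```python
import random


def linear(x, weight, bias):
    """Dense layer: y[j] = sum_i weight[j][i] * x[i] + bias[j]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class DoubleHeadSketch:
    """Toy double head (hypothetical): fc branch -> class logits, conv branch -> box deltas."""

    def __init__(self, channels, height, width, num_classes, hidden=8, seed=0):
        rng = random.Random(seed)
        flat = channels * height * width
        # fc branch: flattening gives every spatial position its own weights.
        self.fc_w = random_matrix(hidden, flat, rng)
        self.fc_b = [0.0] * hidden
        self.cls_w = random_matrix(num_classes, hidden, rng)
        self.cls_b = [0.0] * num_classes
        # conv branch: a 1x1 convolution shares one weight matrix across positions.
        self.conv_w = random_matrix(hidden, channels, rng)
        self.conv_b = [0.0] * hidden
        self.box_w = random_matrix(4, hidden, rng)
        self.box_b = [0.0] * 4

    def classify(self, roi):
        # roi is a nested list indexed as roi[channel][row][col].
        flat = [v for channel in roi for row in channel for v in row]
        hidden = relu(linear(flat, self.fc_w, self.fc_b))
        return linear(hidden, self.cls_w, self.cls_b)

    def localize(self, roi):
        channels = len(roi)
        height, width = len(roi[0]), len(roi[0][0])
        pooled = [0.0] * len(self.conv_w)
        for y in range(height):
            for x in range(width):
                pixel = [roi[c][y][x] for c in range(channels)]
                out = relu(linear(pixel, self.conv_w, self.conv_b))
                pooled = [p + o for p, o in zip(pooled, out)]
        # Global average pooling over all spatial positions.
        pooled = [p / (height * width) for p in pooled]
        return linear(pooled, self.box_w, self.box_b)

    def forward(self, roi):
        return self.classify(roi), self.localize(roi)


if __name__ == "__main__":
    rng = random.Random(1)
    # A made-up 4-channel, 7x7 RoI feature map.
    roi = [[[rng.random() for _ in range(7)] for _ in range(7)] for _ in range(4)]
    head = DoubleHeadSketch(channels=4, height=7, width=7, num_classes=3)
    class_logits, box_deltas = head.forward(roi)
    print(len(class_logits), len(box_deltas))  # prints: 3 4
```

In this sketch both branches read the same RoI feature, but only the fc branch feeds the classifier and only the conv branch feeds the box regressor. A real detector would use much deeper heads and trained weights.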
Abstract
Two head structures (i.e. fully connected head and convolution head) have been widely used in R-CNN based detectors for classification and localization tasks. However, there is a lack of understanding of how these two head structures work for these two tasks. To address this issue, we perform a thorough analysis and find an interesting fact that the two head structures have opposite preferences towards the two tasks. Specifically, the fully connected head (fc-head) is more suitable for the classification task, while the convolution head (conv-head) is more suitable for the localization task. Furthermore, we examine the output feature maps of both heads and find that fc-head has more spatial sensitivity than conv-head. Thus, fc-head has more capability to distinguish a complete object from part of an object, but is not robust to regress the whole object. Based upon these findings,…
Videos
Rethinking Classification and Localization for Object Detection· youtube
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Medical Imaging and Analysis
Methods: Region Proposal Network · Batch Normalization · Average Pooling · Residual Connection · Softmax · RoIAlign · Mask R-CNN · 1x1 Convolution · Feature Pyramid Network
