G-RCN: Optimizing the Gap between Classification and Localization Tasks for Object Detection

Yufan Luo; Li Xiao

arXiv:2012.03677 · cs.CV · December 8, 2020 · 1 cites

Open Access

TL;DR

This paper introduces G-RCN, an object detection paradigm that separates the classification and localization tasks and optimizes the gap between them. With minimal structural changes, it improves AP70 by 3.6 on PASCAL VOC and AP by 1.5 on COCO with ResNet50.

Contribution

The paper proposes G-RCN, a new multi-task learning approach that separates classification and localization tasks, improving accuracy without adding extra modules.

Findings

01

G-RCN improves AP70 by 3.6 on PASCAL VOC with ResNet50.

02

G-RCN enhances AP by 1.5 on COCO with ResNet50.

03

Applying G-RCN to various backbones yields over 2.0 AP70 improvement.

Abstract

Multi-task learning is widely used in computer vision. Currently, object detection models use a shared feature map to complete the classification and localization tasks simultaneously. By comparing the performance of the original Faster R-CNN with that of a variant with partially separated feature maps, we show that: (1) sharing high-level features between the classification and localization tasks is sub-optimal; (2) a large stride is beneficial for classification but harmful for localization; (3) global context information can improve classification performance. Based on these findings, we propose a paradigm called Gap-optimized region-based convolutional network (G-RCN), which aims to separate these two tasks and optimize the gap between them. The paradigm was first applied to correct the current ResNet protocol by simply reducing the stride and moving the Conv5 block from the head to…
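Finding (2) in the abstract concerns the effect of stride on the two tasks. The following is a minimal sketch of the arithmetic behind that trade-off, not the authors' code: the input size (800 px), object size (64 px), strides (16 and 32), and the helper names `feature_map_size` and `box_extent_in_cells` are all illustrative assumptions that do not come from the paper.

```python
import math


def feature_map_size(image_size_px: int, total_stride: int) -> int:
    """Side length of the feature map for a square input, using ceil division."""
    return math.ceil(image_size_px / total_stride)


def box_extent_in_cells(box_size_px: int, total_stride: int) -> float:
    """Number of feature-map cells spanned by one side of a box of the given pixel size."""
    return box_size_px / total_stride


if __name__ == "__main__":
    image_size_px = 800  # illustrative input size (assumption, not from the paper)
    box_size_px = 64     # illustrative object size (assumption, not from the paper)
    for total_stride in (16, 32):
        cells = feature_map_size(image_size_px, total_stride)
        extent = box_extent_in_cells(box_size_px, total_stride)
        print(f"stride {total_stride}: {cells}x{cells} map, box spans {extent:.1f} cells per side")
```

Under these assumed numbers, doubling the stride halves the number of cells an object spans per side, so box boundaries are resolved on a coarser grid, while each cell summarizes a larger image region. This is one plausible reading of why a large stride could help classification and hurt localization; the paper's own experiments are the authority on the actual effect.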

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: RoIPool · Region Proposal Network · 1x1 Convolution · Convolution · Max Pooling · Kaiming Initialization · Softmax · Faster R-CNN · Residual Connection