Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection

Hongkai Zhang; Hong Chang; Bingpeng Ma; Shiguang Shan; Xilin Chen

arXiv:1907.06881 · cs.CV · July 17, 2019 · 50 cites

TL;DR

This paper introduces Cas-RetinaNet, a multistage detector built on the single-stage RetinaNet. It improves the consistency between classification confidence and localization quality, and between features across stages, raising AP on MS COCO from 39.1 to 41.1.

Contribution

The paper proposes a cascade framework whose sequential stages are trained with increasing IoU thresholds, together with a Feature Consistency Module that reduces the misalignment between refined anchors and the features used to classify them.

Findings

01

Cas-RetinaNet improves on RetinaNet's AP on MS COCO, from 39.1 to 41.1.

02

The method yields stable performance gains across different models and input scales.

03

Addressing inconsistency is key to enhancing single-stage detection performance.

Abstract

Recent work attempts to improve detection performance by applying the cascade idea to single-stage detectors. In this paper, we analyze these approaches and find that inconsistency is the major factor limiting their performance: the refined anchors are still associated with features extracted at their previous locations, and the classifier is confused by the misalignment between classification and localization. We further identify two main design rules for cascade detectors: improving the consistency between classification confidence and localization performance, and maintaining feature consistency between stages. We then propose Cas-RetinaNet, a multistage object detector that reduces these misalignments. It consists of sequential stages trained with increasing IoU thresholds, which improve the correlation between confidence and localization, and a novel Feature Consistency Module, which mitigates the feature inconsistency.…
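
The idea of training sequential stages with increasing IoU thresholds can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`iou`, `assign_labels`, `cascade_labels`), the toy boxes, and the thresholds 0.5 and 0.6 are all assumptions chosen for illustration, and RetinaNet's ignore band between negative and positive thresholds is omitted for brevity. Each stage labels the boxes it receives as positive only if their IoU with the ground truth reaches that stage's threshold, so a box that was not refined well can be positive in the first stage and negative in the second.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(boxes: Sequence[Box], gt: Box, pos_thresh: float) -> List[int]:
    """Label a box 1 (positive) if its IoU with gt reaches pos_thresh, else 0."""
    return [1 if iou(box, gt) >= pos_thresh else 0 for box in boxes]


def cascade_labels(
    stage_boxes: Sequence[Sequence[Box]],
    gt: Box,
    thresholds: Sequence[float],
) -> List[List[int]]:
    """Assign labels per stage; stage k uses stage_boxes[k] and thresholds[k].

    stage_boxes[0] holds the original anchors; later entries hold the boxes
    produced by the previous stage's regression (hypothetical values here).
    """
    if len(stage_boxes) != len(thresholds):
        raise ValueError("need one threshold per stage")
    return [assign_labels(boxes, gt, t) for boxes, t in zip(stage_boxes, thresholds)]


# Toy example with a single ground-truth box (all values are made up).
gt_box: Box = (0.0, 0.0, 10.0, 10.0)
anchors = [(2.0, 0.0, 12.0, 10.0), (3.0, 0.0, 13.0, 10.0), (5.0, 0.0, 15.0, 10.0)]
# Pretend stage-1 regression improved the first and third box but not the second.
refined = [(1.0, 0.0, 11.0, 10.0), (3.0, 0.0, 13.0, 10.0), (4.0, 0.0, 14.0, 10.0)]

labels = cascade_labels([anchors, refined], gt_box, thresholds=(0.5, 0.6))
print(labels)  # [[1, 1, 0], [1, 0, 0]]
```

In this toy case the second box has an IoU of about 0.54 in both stages, so it counts as positive under the first threshold but not under the stricter second one. Under these assumptions, the later stage is supervised only by better-localized samples.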

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: 1x1 Convolution · Convolution · Feature Pyramid Network · Focal Loss · RetinaNet