Residual Bi-Fusion Feature Pyramid Network for Accurate Single-shot Object Detection

Ping-Yang Chen; Jun-Wei Hsieh; Chien-Yao Wang; Hong-Yuan Mark Liao; and Munkhjargal Gochoo

arXiv:1911.12051·cs.CV·December 11, 2019·5 cites

Open Access

TL;DR

This paper introduces a residual bi-fusion feature pyramid network that enhances object detection accuracy across scales by effectively combining deep and shallow features in a bidirectional manner, outperforming existing methods.

Contribution

It proposes a novel residual bi-fusion feature pyramid that improves multi-scale detection accuracy and ease of training, especially with deeper backbones.

Findings

01

Achieved state-of-the-art results on VOC and MS COCO datasets.

02

Improved detection accuracy for both small and large objects.

03

Enhanced training stability with deeper network layers.

Abstract

State-of-the-art (SoTA) models have improved object detection accuracy by a large margin via a feature pyramid (FP). An FP is a top-down aggregation that collects semantically strong features to improve scale invariance in both two-stage and one-stage detectors. However, this top-down pathway cannot preserve accurate object positions due to the shift effect of pooling. Thus, the advantage of the FP in improving detection accuracy disappears when more layers are used. The original FP lacks a bottom-up pathway to offset the information lost from lower-layer feature maps. It performs well in large-sized object detection but poorly in small-sized object detection. This paper proposes a new structure, the "residual feature pyramid". It is bidirectional, fusing both deep and shallow features for more effective and robust detection of both small-sized and large-sized objects. Due to the…
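The contrast the abstract draws between a top-down-only pyramid and a bidirectional one can be illustrated with a minimal NumPy sketch. This is not the paper's module: the function names (`top_down`, `bottom_up`, `bi_fusion`), the use of nearest-neighbour upsampling and 2x2 max pooling, the plain element-wise addition, and the equal channel count at every level are all assumptions made for illustration. A real detector would typically use learned convolutions at each fusion step.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(x):
    """2x2 max pooling of a (C, H, W) feature map (H and W assumed even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def top_down(feats):
    """Pass deep (semantic) features down to shallower, higher-resolution maps.

    feats is ordered shallow (high resolution) to deep (low resolution),
    with each level half the spatial size of the previous one.
    """
    out = [feats[-1]]
    for f in reversed(feats[:-1]):
        out.append(f + upsample2x(out[-1]))
    return out[::-1]

def bottom_up(feats):
    """Pass shallow (positional) features up to deeper, lower-resolution maps."""
    out = [feats[0]]
    for f in feats[1:]:
        out.append(f + downsample2x(out[-1]))
    return out

def bi_fusion(feats):
    """Hypothetical bidirectional fusion with a residual skip from the input."""
    fused = bottom_up(top_down(feats))
    # Residual connection: add each original map back to its fused version.
    return [f + g for f, g in zip(feats, fused)]

# Three pyramid levels with 4 channels each, filled with ones.
feats = [np.ones((4, 8, 8)), np.ones((4, 4, 4)), np.ones((4, 2, 2))]
result = bi_fusion(feats)
```

Every output level keeps the shape of its input level, so the fused maps can feed the same detection heads as the original pyramid. The residual addition in `bi_fusion` is one simple way to keep an identity path through the fusion, which is the general idea behind residual designs that are easier to train with deeper backbones.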

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning