Selective Multi-Scale Learning for Object Detection

Junliang Chen; Weizeng Lu; Linlin Shen

arXiv:2206.08206 · cs.CV · June 17, 2022

TL;DR

This paper introduces Selective Multi-Scale Learning (SMSL), a novel feature pyramid network architecture that enhances object detection performance by selectively integrating multi-scale features with minimal additional inference cost.

Contribution

The paper proposes SMSL, a new architecture for feature pyramid networks that improves multi-scale feature integration in object detection models, applicable to both single-stage and two-stage detectors.

Findings

01 RetinaNet with SMSL improves AP by 1.8 points on COCO (from 39.1% to 40.9%).

02 Two-stage detectors with SMSL gain around 1.0 point in AP.

03 SMSL adds nearly no extra inference cost.

Abstract

Pyramidal networks are the standard approach to multi-scale object detection. Current research on feature pyramid networks usually adopts layer connections to collect features from certain levels of the feature hierarchy, and does not consider the significant differences among those levels. We propose a better feature pyramid network architecture, named selective multi-scale learning (SMSL), to address this issue. SMSL is efficient and general: it can be integrated into both single-stage and two-stage detectors to boost detection performance, with nearly no extra inference cost. RetinaNet combined with SMSL obtains a 1.8% improvement in AP (from 39.1% to 40.9%) on the COCO dataset. When integrated with SMSL, two-stage detectors gain around 1.0% in AP.
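For readers unfamiliar with the "layer connections" the abstract refers to, here is a minimal sketch of the top-down merge used in a standard feature pyramid network. It shows the baseline that SMSL sets out to improve, not SMSL itself, since the abstract does not describe the selection mechanism. The function names (`upsample2x`, `lateral`, `fpn_top_down`) are hypothetical, and each feature map is reduced to a single channel so that the 1x1 lateral convolution becomes a plain scalar weight.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def lateral(fmap, weight):
    """Single-channel stand-in for a 1x1 convolution: a per-pixel scale."""
    return [[weight * v for v in row] for row in fmap]


def fpn_top_down(levels, weights):
    """Merge backbone maps ordered fine -> coarse (each half the previous size).

    Every level is treated identically: the coarser merged map is upsampled
    and added to the lateral map, with no per-level selection.
    """
    laterals = [lateral(f, w) for f, w in zip(levels, weights)]
    merged = [laterals[-1]]                        # coarsest level passes through
    for lat in reversed(laterals[:-1]):
        up = upsample2x(merged[0])
        summed = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lat, up)]
        merged.insert(0, summed)
    return merged


# Toy backbone outputs: 4x4, 2x2 and 1x1 maps with constant values.
c3 = [[1.0] * 4 for _ in range(4)]
c4 = [[2.0] * 2 for _ in range(2)]
c5 = [[3.0]]
p3, p4, p5 = fpn_top_down([c3, c4, c5], weights=[1.0, 1.0, 1.0])
print(p5)     # [[3.0]]
print(p4[0])  # [5.0, 5.0]
print(p3[0])  # [6.0, 6.0, 6.0, 6.0]
```

Because every level is merged with the same upsample-and-add rule, the sketch makes concrete the uniform treatment of levels that the abstract criticises.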

Taxonomy

Topics: Remote-Sensing Image Classification · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques

Methods: Feature Pyramid Network · 1x1 Convolution · Focal Loss · Convolution · RetinaNet