Scale-Aware Trident Networks for Object Detection

Yanghao Li; Yuntao Chen; Naiyan Wang; Zhaoxiang Zhang

arXiv:1901.01892 · cs.CV · August 21, 2019 · 90 cites

TL;DR

This paper introduces TridentNet, a scale-aware multi-branch architecture for object detection. It generates scale-specific feature maps through parallel branches and trains each branch with scale-aware sampling, achieving state-of-the-art results on COCO.

Contribution

The paper proposes a novel TridentNet architecture with scale-specific branches and a scale-aware training scheme, enhancing object detection across varying object sizes.

Findings

1. Achieves 48.4 mAP on COCO with a ResNet-101 backbone.

2. Scale-aware training improves detection of objects at different scales.

3. A fast approximation of TridentNet achieves significant improvements over the vanilla detector without any additional parameters or computational cost.
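
The scale-aware training idea in the findings above can be illustrated with a minimal sketch: each branch only trains on objects whose size falls inside that branch's valid range. The helper names (`box_scale`, `assign_boxes_to_branches`), the use of sqrt(width × height) as the size measure, and the numeric ranges are assumptions chosen for illustration, not values confirmed by this page.

```python
import math

# Hypothetical valid scale ranges (in pixels), one per branch.
# These numbers are illustrative assumptions, not taken from this page.
VALID_RANGES = [(0.0, 90.0), (30.0, 160.0), (90.0, math.inf)]


def box_scale(box):
    """Return sqrt(width * height) for a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return math.sqrt(max(x2 - x1, 0.0) * max(y2 - y1, 0.0))


def assign_boxes_to_branches(boxes, valid_ranges=VALID_RANGES):
    """For each branch, keep only the boxes whose scale lies in its range.

    Ranges may overlap, so one box can be a training sample for
    more than one branch.
    """
    per_branch = []
    for low, high in valid_ranges:
        per_branch.append([b for b in boxes if low <= box_scale(b) <= high])
    return per_branch


if __name__ == "__main__":
    boxes = [(0, 0, 20, 20), (0, 0, 50, 50), (0, 0, 100, 100), (0, 0, 300, 300)]
    for i, kept in enumerate(assign_boxes_to_branches(boxes)):
        print(f"branch {i}: {len(kept)} boxes")
```

Overlapping ranges are a deliberate choice in this sketch: objects near a range boundary contribute to both neighbouring branches instead of being dropped.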

Abstract

Scale variation is one of the key challenges in object detection. In this work, we first present a controlled experiment to investigate the effect of receptive fields for scale variation in object detection. Based on the findings from the exploration experiments, we propose a novel Trident Network (TridentNet) aiming to generate scale-specific feature maps with a uniform representational power. We construct a parallel multi-branch architecture in which each branch shares the same transformation parameters but with different receptive fields. Then, we adopt a scale-aware training scheme to specialize each branch by sampling object instances of proper scales for training. As a bonus, a fast approximation version of TridentNet could achieve significant improvements without any additional parameters and computational cost compared with the vanilla detector. On the COCO dataset, our…
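
The weight-sharing idea in the abstract (same transformation parameters, different receptive fields) can be sketched with a toy 1D dilated convolution. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the 1D setting, the 3-tap kernel, and the dilation rates (1, 2, 3) are all hypothetical choices made for this example.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with a given dilation."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span of input positions seen by one output of a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def trident_branches(signal, shared_kernel, dilations=(1, 2, 3)):
    """Apply ONE shared kernel at several dilation rates (one output per branch)."""
    return [dilated_conv1d(signal, shared_kernel, d) for d in dilations]


if __name__ == "__main__":
    shared_kernel = [1.0, 1.0, 1.0]          # the only parameters; reused by every branch
    signal = [0.0] * 4 + [1.0] + [0.0] * 4   # unit impulse at index 4
    for d, out in zip((1, 2, 3), trident_branches(signal, shared_kernel)):
        touched = [i for i, v in enumerate(out) if v != 0.0]
        print(f"dilation={d} receptive_field={receptive_field(3, d)} touched={touched}")
```

Every branch uses the same three weights, so the parameter count does not grow with the number of branches. Only the spacing of the taps changes, which widens the receptive field from 3 to 5 to 7 input positions in this toy setting.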

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Average Pooling · Residual Connection · 1x1 Convolution · Dilated Convolution · Step Decay · Batch Normalization · Random Horizontal Flip · Soft-NMS · TridentNet Block