SNIPER: Efficient Multi-Scale Training
Bharat Singh, Mahyar Najibi, Larry S. Davis

TL;DR
SNIPER is an efficient multi-scale training algorithm for instance-level visual recognition. It processes only context regions around ground-truth instances, which allows training on high-resolution image scales at reduced computational cost and with batches large enough to benefit from batch normalization.
Contribution
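The paper presents SNIPER, a multi-scale training method that processes only the relevant regions of each image rather than the whole image pyramid. Because those regions are small, training can cover high-resolution scales while using larger batch sizes and batch normalization. The results challenge the assumption that instance-level recognition requires training on full high-resolution images.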
Findings
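Achieves 47.6% mAP on COCO bounding-box detection with a ResNet-101 backbone.
Processes 5 images per second during inference on a single GPU.
Processes only 30% more pixels than the commonly used single-scale training at 800x1333 pixels on COCO.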
Abstract
We present SNIPER, an algorithm for performing efficient multi-scale training in instance level visual recognition tasks. Instead of processing every pixel in an image pyramid, SNIPER processes context regions around ground-truth instances (referred to as chips) at the appropriate scale. For background sampling, these context-regions are generated using proposals extracted from a region proposal network trained with a short learning schedule. Hence, the number of chips generated per image during training adaptively changes based on the scene complexity. SNIPER only processes 30% more pixels compared to the commonly used single scale training at 800x1333 pixels on the COCO dataset. But, it also observes samples from extreme resolutions of the image pyramid, like 1400x2000 pixels. As SNIPER operates on resampled low resolution chips (512x512 pixels), it can have a batch size as large as…
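The chip idea in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: the function names, the `min_side`/`max_side` size range, the `stride` of the candidate grid, and the greedy covering rule are all assumptions chosen for illustration. Only the 512x512 chip size comes from the abstract. For one pyramid scale, it keeps the ground-truth boxes whose rescaled size falls in an assumed valid range, then repeatedly picks the candidate chip that fully contains the most still-uncovered boxes.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

CHIP = 512  # chip side length in pixels, as stated in the abstract


def contains(chip: Box, box: Box) -> bool:
    """True if the box lies completely inside the chip."""
    return (chip[0] <= box[0] and chip[1] <= box[1]
            and chip[2] >= box[2] and chip[3] >= box[3])


def candidate_chips(width: float, height: float, stride: int) -> List[Box]:
    """Chips on a regular grid (the stride is an assumed parameter)."""
    xs = range(0, max(int(width) - CHIP, 0) + 1, stride)
    ys = range(0, max(int(height) - CHIP, 0) + 1, stride)
    return [(x, y, x + CHIP, y + CHIP) for y in ys for x in xs]


def select_chips(boxes: List[Box], width: float, height: float, scale: float,
                 min_side: float, max_side: float, stride: int = 32) -> List[Box]:
    """Greedily cover the boxes that are valid at this scale with chips.

    min_side and max_side are hypothetical bounds on the longer side of a
    rescaled box; boxes outside the range are left to other scales.
    """
    scaled = [tuple(c * scale for c in b) for b in boxes]
    valid = [b for b in scaled
             if min_side <= max(b[2] - b[0], b[3] - b[1]) < max_side]
    candidates = candidate_chips(width * scale, height * scale, stride)
    remaining = set(range(len(valid)))
    chips: List[Box] = []
    while remaining:
        best, best_cover = None, set()
        for chip in candidates:
            cover = {i for i in remaining if contains(chip, valid[i])}
            if len(cover) > len(best_cover):
                best, best_cover = chip, cover
        if best is None:
            break  # the remaining boxes fit inside no candidate chip
        chips.append(best)
        remaining -= best_cover
    return chips


if __name__ == "__main__":
    gt = [(10, 10, 100, 100), (200, 200, 300, 300), (800, 800, 900, 900)]
    for chip in select_chips(gt, 1024, 1024, scale=1.0, min_side=0, max_side=512):
        print(chip)
```

In this sketch the number of chips grows with the number and spread of valid boxes, which mirrors the abstract's statement that the chip count adapts to scene complexity. As a rough check on the pixel budget, 800x1333 is 1,066,400 pixels and one 512x512 chip is 262,144 pixels, so a budget 30% above single-scale training (about 1.39 million pixels) would correspond to roughly five chips per image if all processed pixels belonged to chips.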
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Average Pooling · SNIPER · Weight Decay · Softmax · RoIPool · Faster R-CNN · Region Proposal Network · Residual Connection · 1x1 Convolution
