On the Efficacy of Multi-scale Data Samplers for Vision Applications

Elvis Nunez; Thomas Merth; Anish Prabhu; Mehrdad Farajtabar; Mohammad Rastegari; Sachin Mehta; Maxwell Horton

arXiv:2309.04502·cs.CV·September 12, 2023

Open Access

TL;DR

This paper empirically investigates multi-scale data samplers for vision tasks, showing that they act as implicit regularizers, improve accuracy, calibration, and robustness, and reduce training compute across classification, detection, and segmentation.

Contribution

It provides a comprehensive analysis of variable batch size multi-scale samplers, revealing their regularization effects and practical benefits in training efficiency and model performance.

Findings

01. Multi-scale samplers act as implicit data regularizers.

02. They accelerate training and improve model calibration.

03. They achieve a compute reduction of over 30% and a 3-4% mAP increase on MS-COCO.

Abstract

Multi-scale resolution training has seen an increased adoption across multiple vision tasks, including classification and detection. Training with smaller resolutions enables faster training at the expense of a drop in accuracy. Conversely, training with larger resolutions has been shown to improve performance, but memory constraints often make this infeasible. In this paper, we empirically study the properties of multi-scale training procedures. We focus on variable batch size multi-scale data samplers that randomly sample an input resolution at each training iteration and dynamically adjust their batch size according to the resolution. Such samplers have been shown to improve model accuracy beyond standard training with a fixed batch size and resolution, though it is not clear why this is the case. We explore the properties of these data samplers by performing extensive experiments on…
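The sampler described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the reference resolution of 224, the reference batch size of 128, and the rule of scaling the batch size to keep the pixel count per batch roughly constant are all assumptions made for the example.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def batch_size_for(resolution: int,
                   ref_resolution: int = 224,
                   ref_batch_size: int = 128) -> int:
    """Assumed rule: keep the number of pixels per batch roughly constant.

    Smaller resolutions get larger batches, larger resolutions get smaller ones.
    """
    scale = (ref_resolution * ref_resolution) / (resolution * resolution)
    return max(1, int(ref_batch_size * scale))


def multi_scale_batches(num_samples: int,
                        resolutions: Sequence[int],
                        ref_resolution: int = 224,
                        ref_batch_size: int = 128,
                        seed: int = 0) -> Iterator[Tuple[int, List[int]]]:
    """Yield (resolution, sample_indices) pairs covering one epoch.

    At every iteration a resolution is drawn at random and the batch size
    is adjusted to match it (hypothetical helper, for illustration only).
    """
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)

    start = 0
    while start < num_samples:
        resolution = rng.choice(list(resolutions))
        batch_size = batch_size_for(resolution, ref_resolution, ref_batch_size)
        # The last batch of the epoch may be shorter than batch_size.
        yield resolution, indices[start:start + batch_size]
        start += batch_size


if __name__ == "__main__":
    for resolution, batch in multi_scale_batches(1000, [112, 224, 448]):
        print(resolution, len(batch))
```

In a training loop, each yielded index list would be loaded and resized to the paired resolution before the forward pass. Under the assumed scaling rule, the memory footprint per iteration stays roughly level, since low-resolution iterations process more images and high-resolution iterations process fewer.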

Taxonomy

Topics: Advanced Neural Network Applications · Image Processing Techniques and Applications · Domain Adaptation and Few-Shot Learning

Methods: Region Proposal Network · Softmax · Focus · RoIAlign · Convolution · Mask R-CNN