AdaSelection: Accelerating Deep Learning Training through Data Subsampling
Minghe Zhang, Chaosheng Dong, Jinmiao Fu, Tianchen Zhou, Jia Liang,, Jia Liu, Bo Liu, Michinari Momma, Bryan Wang, Yan Gao, Yi Sun

TL;DR
AdaSelection is an adaptive data subsampling technique that accelerates deep learning training by selecting the most informative samples, improving efficiency without sacrificing accuracy across various tasks.
Contribution
The paper introduces AdaSelection, a novel adaptive subsampling method that combines multiple importance measures to enhance training speed in large-scale deep learning models.
Findings
Outperforms industry-standard baselines in classification and regression tasks
Speeds up training without loss of model performance
Effective across image and language datasets
Abstract
In this paper, we introduce AdaSelection, an adaptive sub-sampling method to identify the most informative sub-samples within each minibatch to speed up the training of large-scale deep learning models without sacrificing model performance. Our method is able to flexibly combines an arbitrary number of baseline sub-sampling methods incorporating the method-level importance and intra-method sample-level importance at each iteration. The standard practice of ad-hoc sampling often leads to continuous training with vast amounts of data from production environments. To improve the selection of data instances during forward and backward passes, we propose recording a constant amount of information per instance from these passes. We demonstrate the effectiveness of our method by testing it across various types of inputs and tasks, including the classification tasks on both image and language…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCOVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
