One Backward from Ten Forward, Subsampling for Large-Scale Deep Learning

Chaosheng Dong; Xiaojie Jin; Weihao Gao; Yijia Wang; Hongyi Zhang; Xiang Wu; Jianchao Yang; Xiaobing Liu

arXiv:2104.13114·cs.LG·April 28, 2021·1 cites

TL;DR

This paper introduces a subsampling method for large-scale deep learning. It records a constant amount of information per instance from the forward passes already performed during inference, then uses that information to select which instances participate in training.

Contribution

It proposes an optimization framework for analyzing this data-selection problem, together with an efficient approximation algorithm, to address the challenge of large-scale streaming training data.

Findings

01 Improved data sampling leads to better model training efficiency.

02 The method outperforms traditional ad-hoc sampling baselines.

03 Effective on large-scale classification and regression tasks.

Abstract

Deep learning models in large-scale machine learning systems are often continuously trained with enormous data from production environments. The sheer volume of streaming training data poses a significant challenge to real-time training subsystems and ad-hoc sampling is the standard practice. Our key insight is that these deployed ML systems continuously perform forward passes on data instances during inference, but ad-hoc sampling does not take advantage of this substantial computational effort. Therefore, we propose to record a constant amount of information per instance from these forward passes. The extra information measurably improves the selection of which data instances should participate in forward and backward passes. A novel optimization framework is proposed to analyze this problem and we provide an efficient approximation algorithm under the framework of Mini-batch gradient…
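The abstract does not spell out the selection rule, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes the constant-size record per instance is a single scalar score (for example, the loss observed during the inference-time forward pass), keeps each instance with probability proportional to that score, and attaches an importance weight of 1/p so that the weighted gradient stays unbiased. The names `inclusion_probabilities`, `subsample`, `budget`, and `floor` are hypothetical and chosen for illustration.

```python
import random


def inclusion_probabilities(scores, budget, floor=1e-3):
    """Map recorded per-instance scores to keep-probabilities.

    Assumption: a higher score (e.g. inference-time loss) means the instance
    is more useful for training. Probabilities are proportional to the score,
    scaled so they sum to roughly `budget`, then clipped to [floor, 1].
    """
    if not scores:
        return []
    total = sum(scores)
    if total <= 0:
        # No signal recorded: fall back to uniform sampling.
        return [min(1.0, budget / len(scores))] * len(scores)
    return [min(1.0, max(floor, budget * s / total)) for s in scores]


def subsample(instances, scores, budget, rng):
    """Keep each instance independently with its inclusion probability.

    Returns (instance, weight) pairs, where weight = 1 / p is the importance
    weight to apply to that instance's loss in the backward pass.
    """
    probs = inclusion_probabilities(scores, budget)
    kept = []
    for x, p in zip(instances, probs):
        if rng.random() < p:
            kept.append((x, 1.0 / p))
    return kept


if __name__ == "__main__":
    rng = random.Random(0)
    stream = ["a", "b", "c", "d"]
    recorded_loss = [0.1, 0.1, 0.2, 1.6]  # hypothetical scores from inference
    for item, weight in subsample(stream, recorded_loss, budget=2, rng=rng):
        print(item, round(weight, 3))
```

The floor keeps every instance reachable, which bounds the largest importance weight at 1/floor. A production system would likely tune that bound against gradient variance.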

Taxonomy

Topics: Machine Learning and Algorithms · Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques