Faster Learning by Reduction of Data Access Time

Vinod Kumar Chauhan; Anuj Sharma; Kalpana Dahiya

arXiv:1801.05931·cs.LG·July 26, 2018

TL;DR

This paper addresses the big data challenge in machine learning by proposing systematic and cyclic sampling methods to reduce data access time, resulting in significantly faster training without sacrificing convergence.

Contribution

It introduces systematic and cyclic sampling techniques for mini-batch selection, demonstrating their effectiveness in speeding up training in empirical risk minimization.

Findings

01. Training was observed to be up to six times faster.

02. Convergence is proven theoretically for the proposed sampling methods.

03. The methods are effective on benchmark datasets for problems satisfying strong convexity and smoothness assumptions.

Abstract

Nowadays, the major challenge in machine learning is the big data challenge. In big data problems, the training of models has become very slow because of the large number of data points, the large number of features in each data point, or both. The training time has two major components: the time to access the data and the time to process (learn from) the data. So far, research has focused only on the second part, i.e., learning from the data. In this paper, we propose one possible solution for handling big data problems in machine learning. The idea is to reduce the training time by reducing data access time, using systematic sampling and cyclic/sequential sampling to select mini-batches from the dataset. To prove the effectiveness of the proposed sampling techniques, we have used Empirical Risk Minimization, which is a commonly used machine learning problem, for strongly convex and…
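The two mini-batch selection ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `step` parameter, and the wrap-around at the end of the dataset are assumptions made for illustration. Cyclic/sequential sampling sweeps the data in storage order, while systematic sampling picks a random starting index and then follows a fixed pattern, so both read mostly contiguous data instead of scattered random rows.

```python
import random
from typing import Iterator, List, Optional


def cyclic_batches(n: int, batch_size: int) -> Iterator[List[int]]:
    """Yield mini-batches of consecutive indices, sweeping 0..n-1 in order."""
    for start in range(0, n, batch_size):
        # The last batch may be shorter when batch_size does not divide n.
        yield list(range(start, min(start + batch_size, n)))


def systematic_batch(
    n: int,
    batch_size: int,
    step: int = 1,
    start: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return one mini-batch: a random start, then every `step`-th index.

    Assumption: indices wrap around modulo n. With step=1 the batch is a
    contiguous block beginning at a random position.
    """
    if start is None:
        rng = rng or random.Random()
        start = rng.randrange(n)
    return [(start + j * step) % n for j in range(batch_size)]


if __name__ == "__main__":
    print(list(cyclic_batches(10, 4)))
    print(systematic_batch(10, 4, rng=random.Random(0)))
```

In both schemes only the starting position (or nothing at all) is random, which is what allows a batch to be read as one contiguous slice of the stored data.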

Taxonomy

Methods: SAGA