Using More Data to Speed-up Training Time

Shai Shalev-Shwartz; Ohad Shamir; Eran Tromer

arXiv:1106.1216 · cs.LG · June 16, 2011 · 5 cites

Open Access

TL;DR

This paper studies how the runtime of learning depends on the number of available training examples. It gives initial positive results in which the runtime decreases exponentially while the number of examples grows only polynomially, underscores the main high-level techniques, and spells out several open problems.

Contribution

The paper provides initial positive results showing that training runtime can decrease exponentially while the number of examples grows only polynomially. It also highlights the main high-level techniques for using more data to speed up training and the open problems that remain.

Findings

01. Runtime can decrease exponentially with polynomial data growth

02. Identifies key high-level techniques for reducing training time

03. Outlines open problems for future research

Abstract

In many recent applications, data is plentiful. By now, we have a rather clear understanding of how more data can be used to improve the accuracy of learning algorithms. Recently, there has been a growing interest in understanding how more data can be leveraged to reduce the required training runtime. In this paper, we study the runtime of learning as a function of the number of available training examples, and underscore the main high-level techniques. We provide some initial positive results showing that the runtime can decrease exponentially while only requiring a polynomial growth of the number of examples, and spell out several interesting open problems.

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Algorithms and Data Compression