Reducing Runtime by Recycling Samples
Jialei Wang, Hai Wang, Nathan Srebro

TL;DR
This paper explores how recycling samples in variance-reduced stochastic methods such as SDCA, SAG, and SVRG can reduce runtime. It challenges the conventional approach of always using fresh samples and studies the optimal sample size to use.
Contribution
It introduces the idea of sample recycling in variance-reduction methods and empirically studies its benefits and optimal configurations.
Findings
Reusing samples can reduce runtime in variance-reduction algorithms, even when fresh samples are available.
Optimal sample size depends on the method and problem setting.
Running SDCA for an integer number of epochs may be inefficient.
Abstract
Contrary to the situation with stochastic gradient descent, we argue that when using stochastic methods with variance reduction, such as SDCA, SAG or SVRG, as well as their variants, it could be beneficial to reuse previously used samples instead of fresh samples, even when fresh samples are available. We demonstrate this empirically for SDCA, SAG and SVRG, studying the optimal sample size one should use, and also uncover behavior that suggests running SDCA for an integer number of epochs could be wasteful.
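The idea of recycling samples can be illustrated with a minimal SVRG sketch. This is not the authors' experimental setup: the regularized least-squares objective, the synthetic data, and all names and parameter values (`svrg_recycled`, `pool`, `m`, `step`, `LAM`) are assumptions chosen for illustration. A subset of `m` examples is drawn once from a larger pool and then reused in every epoch, rather than drawing fresh examples each time.

```python
import random

LAM = 0.1  # L2 regularization strength (illustrative assumption)

def grad(w, x, y):
    # Gradient of 0.5 * (w.x - y)^2 + 0.5 * LAM * ||w||^2 for one example.
    r = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [r * xi + LAM * wi for wi, xi in zip(w, x)]

def objective(w, sample):
    # Regularized mean squared loss over the given sample.
    loss = sum(0.5 * (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
               for x, y in sample) / len(sample)
    return loss + 0.5 * LAM * sum(wi * wi for wi in w)

def svrg_recycled(pool, m, epochs, step, seed=0):
    rng = random.Random(seed)
    sample = rng.sample(pool, m)  # drawn once, then recycled in every epoch
    d = len(sample[0][0])
    w = [0.0] * d
    for _ in range(epochs):
        w_snap = list(w)
        # Full gradient over the recycled sample, evaluated at the snapshot.
        full = [0.0] * d
        for x, y in sample:
            g = grad(w_snap, x, y)
            full = [f + gi / m for f, gi in zip(full, g)]
        # Inner loop: variance-reduced stochastic steps on the same sample.
        for _ in range(m):
            x, y = sample[rng.randrange(m)]
            g_w = grad(w, x, y)
            g_s = grad(w_snap, x, y)
            w = [wi - step * (a - b + f)
                 for wi, a, b, f in zip(w, g_w, g_s, full)]
    return w, sample

# Synthetic pool of 500 examples; only m = 50 of them are ever touched.
data_rng = random.Random(1)
w_true = [1.0, -2.0, 0.5]
pool = []
for _ in range(500):
    x = [data_rng.uniform(-1, 1) for _ in range(3)]
    y = sum(a * b for a, b in zip(w_true, x)) + data_rng.gauss(0, 0.1)
    pool.append((x, y))

w_hat, sample = svrg_recycled(pool, m=50, epochs=20, step=0.05)
```

In this sketch the sample size `m` is the quantity the paper studies: a smaller `m` makes each full-gradient pass cheaper, while a larger `m` makes the recycled sample more representative of the underlying distribution.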
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms
