Hybrid Deterministic-Stochastic Methods for Data Fitting
Michael P. Friedlander, Mark Schmidt

TL;DR
This paper introduces hybrid deterministic-stochastic optimization methods that combine the fast initial progress of incremental gradient algorithms with the steady convergence of full-gradient methods. The approach is supported by rate-of-convergence analysis and numerical experiments.
Contribution
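The paper proposes a hybrid approach that controls the sample size in an incremental gradient algorithm, so that the method keeps the inexpensive early iterations of incremental algorithms while retaining the steady convergence rates of full-gradient methods.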
Findings
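Controlling the sample size lets hybrid methods maintain the steady convergence rates of full-gradient methods.
A practical quasi-Newton implementation based on this approach is described.
Numerical experiments illustrate the potential benefits of the approach.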
Abstract
Many structured data-fitting applications require the solution of an optimization problem involving a sum over a potentially large number of measurements. Incremental gradient algorithms offer inexpensive iterations by sampling a subset of the terms in the sum. These methods can make great progress initially, but often slow as they approach a solution. In contrast, full-gradient methods achieve steady convergence at the expense of evaluating the full objective and gradient on each iteration. We explore hybrid methods that exhibit the benefits of both approaches. Rate-of-convergence analysis shows that by controlling the sample size in an incremental gradient algorithm, it is possible to maintain the steady convergence rates of full-gradient methods. We detail a practical quasi-Newton implementation based on this approach. Numerical experiments illustrate its potential benefits.
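The sample-size idea in the abstract can be illustrated with a minimal sketch. It is not the authors' quasi-Newton implementation: it uses plain gradient steps on a small synthetic least-squares problem, and the geometric batch-growth schedule, the step size, the function names (`sample_gradient`, `hybrid_gradient_descent`, `make_data`), and every parameter value are assumptions chosen only for illustration. Early iterations touch a few measurements, and the batch grows until each step uses the full gradient.

```python
import math
import random


def sample_gradient(x, data, batch):
    """Average gradient of 0.5 * (a.x - b)^2 over the sampled measurements."""
    grad = [0.0] * len(x)
    for i in batch:
        a, b = data[i]
        residual = sum(aj * xj for aj, xj in zip(a, x)) - b
        for j, aj in enumerate(a):
            grad[j] += residual * aj
    return [g / len(batch) for g in grad]


def hybrid_gradient_descent(data, dim, step=0.5, batch0=2, growth=1.1,
                            iters=200, seed=0):
    """Gradient descent whose batch size grows geometrically (assumed schedule).

    Starts like an incremental method (tiny batches) and ends like a
    full-gradient method once the batch covers all measurements.
    """
    rng = random.Random(seed)
    x = [0.0] * dim
    sizes = []
    for k in range(iters):
        # Illustrative schedule: batch0 * growth**k, capped at the data size.
        size = min(len(data), math.ceil(batch0 * growth ** k))
        batch = rng.sample(range(len(data)), size)
        grad = sample_gradient(x, data, batch)
        x = [xj - step * gj for xj, gj in zip(x, grad)]
        sizes.append(size)
    return x, sizes


def make_data(m=200, noise=0.01, seed=1):
    """Synthetic measurements b = a.x_true + small Gaussian noise."""
    rng = random.Random(seed)
    x_true = [2.0, -1.0]
    data = []
    for _ in range(m):
        a = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        b = sum(aj * xj for aj, xj in zip(a, x_true)) + rng.gauss(0.0, noise)
        data.append((a, b))
    return data, x_true


if __name__ == "__main__":
    data, x_true = make_data()
    x, sizes = hybrid_gradient_descent(data, dim=2)
    print("estimate:", x)
    print("first and last batch sizes:", sizes[0], sizes[-1])
```

With these assumed settings the batch reaches the full data set after roughly 50 iterations, so the remaining steps are ordinary full-gradient steps. A slower growth factor keeps iterations cheap for longer at the cost of noisier steps; how to choose that schedule is the question the paper's rate-of-convergence analysis addresses.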
