Semi-Stochastic Coordinate Descent
Jakub Konečný, Zheng Qu, Peter Richtárik

TL;DR
The paper introduces semi-stochastic coordinate descent (S2CD), an optimization method that alternates one deterministic full-gradient step with many stochastic single-coordinate steps to minimize a strongly convex function given as the average of many smooth convex functions.
Contribution
It presents a semi-stochastic coordinate descent algorithm whose stochastic steps sample both a function and a coordinate from nonuniform distributions and update a single coordinate, and it analyzes the method's complexity in terms of a new condition number $\hat{\kappa}$.
Findings
Achieves $O(n\log(1/\epsilon))$ gradient evaluations.
Achieves $O(\hat{\kappa}\log(1/\epsilon))$ partial derivative evaluations.
Progressively improves the stochastic gradient estimate.
Abstract
We propose a novel stochastic gradient method---semi-stochastic coordinate descent (S2CD)---for the problem of minimizing a strongly convex function represented as the average of a large number of smooth convex functions: $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$. Our method first performs a deterministic step (computation of the gradient of $f$ at the starting point), followed by a large number of stochastic steps. The process is repeated a few times, with the last stochastic iterate becoming the new starting point where the deterministic step is taken. The novelty of our method is in how the stochastic steps are performed. In each such step, we pick a random function $f_i$ and a random coordinate $j$---both using nonuniform distributions---and update a single coordinate of the decision vector only, based on the computation of the partial derivative of $f_i$ at two different points.…
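The loop structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the ridge-regularized least-squares instance $f_i(x) = \frac{1}{2}(a_i^\top x - b_i)^2 + \frac{\lambda}{2}\|x\|^2$, the name `s2cd_ridge`, the `step_scale` parameter, the conservative step-size rule, and the choice of sampling probabilities proportional to coordinate-wise smoothness constants. The sketch only shows one way to make the single-coordinate update an unbiased estimate of the gradient.

```python
import numpy as np


def s2cd_ridge(A, b, lam, epochs=30, inner_steps=3000, step_scale=0.05, seed=0):
    """Illustrative semi-stochastic coordinate descent for ridge least squares.

    Assumed problem: f(x) = (1/n) * sum_i [0.5*(a_i.x - b_i)^2 + 0.5*lam*|x|^2].
    The sampling distributions and step size are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape

    # L[i, j]: smoothness constant of the j-th partial derivative of f_i.
    L = A ** 2 + lam
    v = L.sum(axis=0)
    p = v / v.sum()          # nonuniform probability of picking coordinate j
    q = L / v                # q[:, j]: nonuniform probability of picking f_i given j
    h = step_scale / (d * L.sum(axis=1).mean())  # conservative step (assumption)

    def partial(i, j, z):
        # j-th partial derivative of f_i at z. A real implementation would
        # maintain residuals so this costs O(1); here it costs O(d).
        return (A[i] @ z - b[i]) * A[i, j] + lam * z[j]

    x = np.zeros(d)
    for _ in range(epochs):
        # Deterministic step: full gradient at the current starting point x.
        g = A.T @ (A @ x - b) / n + lam * x
        y = x.copy()

        # Pre-sample coordinates, then functions conditioned on each coordinate.
        js = rng.choice(d, size=inner_steps, p=p)
        idx = np.empty(inner_steps, dtype=int)
        for j in range(d):
            mask = js == j
            idx[mask] = rng.choice(n, size=int(mask.sum()), p=q[:, j])

        # Stochastic steps: each one changes a single coordinate of y and uses
        # the partial derivative of f_i at two points (y and x).
        for j, i in zip(js, idx):
            correction = (partial(i, j, y) - partial(i, j, x)) / (n * q[i, j])
            y[j] -= (h / p[j]) * (g[j] + correction)

        # The last stochastic iterate becomes the new starting point.
        x = y
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((50, 5))
    b = rng.standard_normal(50)
    x_hat = s2cd_ridge(A, b, lam=0.1)
    x_star = np.linalg.solve(A.T @ A / 50 + 0.1 * np.eye(5), A.T @ b / 50)
    print("distance to closed-form solution:", np.linalg.norm(x_hat - x_star))
```

Dividing the correction term by `n * q[i, j]` and the whole update by `p[j]` makes the expected update equal to the full gradient at `y` under these sampling distributions. Because the correction shrinks as `y` and `x` approach the minimizer, the gradient estimate improves as the method progresses.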
