Tight Complexity Bounds for Optimizing Composite Objectives
Blake Woodworth, Nathan Srebro

TL;DR
This paper establishes tight bounds on the complexity of minimizing an average of convex component functions, shows a gap between deterministic and randomized methods, and identifies optimal algorithms for the smooth and non-smooth cases.
Contribution
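It provides tight upper and lower bounds for minimizing an average of convex functions with gradient and prox oracles, and identifies optimal algorithms in the deterministic and randomized settings, for both smooth and non-smooth components.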
Findings
Accelerated gradient descent is optimal for smooth functions in deterministic settings.
An accelerated variant of SVRG is optimal in randomized settings.
For non-smooth functions, access to prox oracles reduces the complexity compared with methods that use only gradient access.
Abstract
We provide tight upper and lower bounds on the complexity of minimizing the average of convex functions using gradient and prox oracles of the component functions. We show a significant gap between the complexity of deterministic vs randomized optimization. For smooth functions, we show that accelerated gradient descent (AGD) and an accelerated variant of SVRG are optimal in the deterministic and randomized settings respectively, and that a gradient oracle is sufficient for the optimal rate. For non-smooth functions, having access to prox oracles reduces the complexity and we present optimal methods based on smoothing that improve over methods using just gradient accesses.
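To make the two oracle types in the abstract concrete, here is a minimal Python sketch for a toy case. It assumes least-squares components f_i(x) = 0.5 * (a_i . x - b_i)^2 and takes the prox oracle to return argmin_u f_i(u) + (beta / 2) * ||x - u||^2; both the choice of component function and the helper names (dot, grad_oracle, prox_oracle, average_gradient) are illustrative assumptions, not taken from the paper.

```python
# Hypothetical illustration: components f_i(x) = 0.5 * (a_i . x - b_i)**2.
# None of these helper names come from the paper.

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def grad_oracle(a, b, x):
    """Gradient of one component at x: a * (a . x - b)."""
    r = dot(a, x) - b
    return [ai * r for ai in a]

def prox_oracle(a, b, x, beta):
    """argmin_u f_i(u) + (beta / 2) * ||x - u||^2, closed form for this f_i."""
    r = dot(a, x) - b
    t = r / (beta + dot(a, a))  # step length along a
    return [xi - t * ai for xi, ai in zip(x, a)]

def average_gradient(components, x):
    """Gradient of F(x) = (1/m) * sum_i f_i(x); uses m gradient-oracle calls."""
    m = len(components)
    total = [0.0] * len(x)
    for a, b in components:
        g = grad_oracle(a, b, x)
        total = [ti + gi for ti, gi in zip(total, g)]
    return [ti / m for ti in total]
```

In this sketch, a full-gradient method such as AGD would call average_gradient at every iteration, which costs m component queries, whereas a randomized method such as SVRG mostly queries grad_oracle on sampled components. The complexity bounds discussed in the abstract count these per-component oracle calls.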
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
