Geometry, Computation, and Optimality in Stochastic Optimization
Chen Cheng, Daniel Levy, John C. Duchi

TL;DR
This paper explores how the geometry of constraint sets influences the optimality of stochastic optimization methods, revealing when simple gradient methods suffice and when more complex nonlinear updates are necessary.
Contribution
It characterizes the problem families where stochastic and adaptive-gradient methods are optimal and identifies the geometric conditions requiring nonlinear updates for optimal convergence.
Findings
Diagonally pre-conditioned stochastic gradient methods are minimax optimal for quadratically convex sets.
The sub-optimality of subgradient methods depends on the 'distance' from quadratic convexity.
Results apply to $\ell_p$-balls for $p<2$, illustrating computation-accuracy tradeoffs.
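For readers unfamiliar with the term, here is a hedged sketch of what "quadratically convex" means, using the standard definition from the Gaussian sequence model literature (the symbols $\Theta$, $\Theta^2$, $u$, and $d$ are our notation, not taken from this summary). A set $\Theta \subset \mathbb{R}^d$ that is symmetric under coordinate sign flips is quadratically convex when its image under coordinate-wise squaring is convex:

$$
\Theta^2 := \{(\theta_1^2, \dots, \theta_d^2) : \theta \in \Theta\} \quad \text{is convex.}
$$

For the unit $\ell_p$-ball this image is

$$
\Theta^2 = \Big\{ u \in \mathbb{R}_+^d : \sum_{j=1}^d u_j^{p/2} \le 1 \Big\},
$$

which is convex when $p \ge 2$. When $p < 2$ and $d \ge 2$ it is not: the points $e_1$ and $e_2$ both lie in $\Theta^2$, but their midpoint gives $\sum_j u_j^{p/2} = 2 \cdot (1/2)^{p/2} = 2^{1 - p/2} > 1$. This is consistent with the $p<2$ regime singled out above.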
Abstract
We study computational and statistical consequences of problem geometry in stochastic and online optimization. By focusing on constraint set and gradient geometry, we characterize the problem families for which stochastic- and adaptive-gradient methods are (minimax) optimal and, conversely, when nonlinear updates -- such as those mirror descent employs -- are necessary for optimal convergence. When the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We provide quantitative converses showing that the "distance" of the underlying constraints from quadratic convexity determines the sub-optimality of subgradient methods. These results apply, for example, to any $\ell_p$-ball for $p<2$, and the computation/accuracy tradeoffs they demonstrate exhibit a striking analogy to those in Gaussian sequence models.
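To make the two kinds of update concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes an AdaGrad-style diagonal pre-conditioner on a box (an $\ell_\infty$-ball, which is quadratically convex) and, for contrast, an exponentiated-gradient step on the probability simplex as one familiar example of a nonlinear mirror descent update. The names `adagrad_box`, `exponentiated_gradient_step`, `grad_oracle`, and every parameter are hypothetical choices for illustration.

```python
import numpy as np


def adagrad_box(grad_oracle, dim, radius=1.0, steps=500, lr=1.0, eps=1e-8, seed=0):
    """Diagonally pre-conditioned stochastic gradient method on the box
    [-radius, radius]^dim, returning the running average of the iterates.

    grad_oracle(theta, rng) is an assumed interface that returns a
    stochastic (sub)gradient at theta.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    sum_sq = np.zeros(dim)  # per-coordinate sum of squared gradients
    avg = np.zeros(dim)
    for t in range(1, steps + 1):
        g = grad_oracle(theta, rng)
        sum_sq += g ** 2
        # Linear update: each coordinate is rescaled by its own step size.
        theta = theta - lr * g / (np.sqrt(sum_sq) + eps)
        # For a box, projection in any diagonal norm is coordinate-wise clipping.
        theta = np.clip(theta, -radius, radius)
        avg += (theta - avg) / t  # running mean of the iterates
    return avg


def exponentiated_gradient_step(theta, g, lr):
    """One entropic mirror descent step on the probability simplex.

    The update is multiplicative (nonlinear in theta), unlike the additive
    rescaled step above. Assumes theta is strictly positive and sums to one.
    """
    w = theta * np.exp(-lr * g)
    return w / w.sum()
```

A usage sketch under the same assumptions: with a noisy quadratic oracle such as `lambda th, rng: th - target + 0.1 * rng.standard_normal(th.shape)` and a `target` inside the box, `adagrad_box` returns an averaged iterate close to `target`. The box is chosen only because its projection is trivial; the paper's results concern which geometries make this kind of diagonal scaling optimal, which this sketch does not demonstrate.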
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research
