Stochastic Zeroth order Descent with Structured Directions
Marco Rando, Cesare Molinari, Silvia Villa, Lorenzo Rosasco

TL;DR
This paper introduces a stochastic zeroth order optimization method using multiple structured directions, providing convergence guarantees for convex and certain non-convex functions, with practical applications in hyper-parameter tuning.
Contribution
It proposes a novel structured stochastic zeroth order descent method with convergence analysis for convex and Polyak-Łojasiewicz functions, extending zeroth order optimization theory.
Findings
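Convergence rate arbitrarily close to that of stochastic gradient descent, in terms of number of iterations, for smooth convex functions.
First convergence rates for stochastic structured zeroth order algorithms on non-convex functions satisfying the Polyak-Łojasiewicz condition.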
Competitive performance in hyper-parameter optimization tasks.
Abstract
We introduce and analyze Structured Stochastic Zeroth order Descent (S-SZD), a finite difference approach that approximates a stochastic gradient on a set of $l \leq d$ orthogonal directions, where $d$ is the dimension of the ambient space. These directions are randomly chosen and may change at each step. For smooth convex functions we prove almost sure convergence of the iterates and a convergence rate on the function values of the form $O((d/l) k^{-c})$ for every $c < 1/2$, which is arbitrarily close to the one of Stochastic Gradient Descent (SGD) in terms of number of iterations. Our bound shows the benefits of using multiple directions instead of one. For non-convex functions satisfying the Polyak-Łojasiewicz condition, we establish the first convergence rates for stochastic structured zeroth order algorithms under such an assumption. We corroborate our theoretical findings…
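The abstract describes the method only in words, so the following is a minimal Python sketch of the idea, not the authors' implementation. At each step it draws l random orthonormal directions in dimension d (with l at most d), forms forward finite differences of a stochastic objective along them, and takes a descent step. The function names, the Gram-Schmidt sampling of directions, the d/l rescaling, and the step-size and discretization schedules are all assumptions made for illustration.

```python
import math
import random


def random_orthonormal_directions(d, l, rng):
    """Return l orthonormal vectors in R^d (Gram-Schmidt on Gaussian samples)."""
    directions = []
    while len(directions) < l:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Remove the components along the directions already accepted.
        for p in directions:
            dot = sum(vi * pi for vi, pi in zip(v, p))
            v = [vi - dot * pi for vi, pi in zip(v, p)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:  # resample in the (unlikely) degenerate case
            directions.append([vi / norm for vi in v])
    return directions


def structured_fd_gradient(f, x, directions, h):
    """Forward finite-difference surrogate of the gradient of f at x.

    Assumption of this sketch: the sum over directions is rescaled by d/l,
    so that for uniformly random orthonormal directions the directional
    part averages to the full gradient.
    """
    d, l = len(x), len(directions)
    fx = f(x)
    g = [0.0] * d
    for p in directions:
        x_shift = [xi + h * pi for xi, pi in zip(x, p)]
        slope = (f(x_shift) - fx) / h  # approximates the directional derivative
        for i in range(d):
            g[i] += slope * p[i]
    return [(d / l) * gi for gi in g]


def s_szd(F, sample_z, x0, l, steps, rng, alpha0=0.2, h0=1e-3):
    """Run the sketched zeroth order descent on a stochastic objective F(x, z)."""
    x = list(x0)
    d = len(x)
    for k in range(1, steps + 1):
        alpha = alpha0 / math.sqrt(k)  # hypothetical step-size schedule
        h = h0 / k                     # hypothetical discretization schedule
        z = sample_z(rng)              # one random sample per iteration
        directions = random_orthonormal_directions(d, l, rng)
        g = structured_fd_gradient(lambda y: F(y, z), x, directions, h)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x
```

Each iteration of this sketch costs l + 1 evaluations of the objective, all on the same sample z. With l = d the directions form an orthonormal basis and the surrogate reduces to an ordinary forward-difference gradient in a rotated coordinate system; smaller l trades accuracy per step for fewer function evaluations.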
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Statistical Methods and Inference
