Convergence Rates of Stochastic Zeroth-order Gradient Descent for Łojasiewicz Functions

Tianyu Wang; Yasong Feng

arXiv:2210.16997·math.OC·April 20, 2023

TL;DR

This paper establishes convergence rates for Stochastic Zeroth-order Gradient Descent (SZGD) algorithms applied to Łojasiewicz functions, and shows that the function values can converge faster than the iterates, whether the objective is smooth or nonsmooth.

Contribution

It provides the first convergence rate analysis of SZGD algorithms for Łojasiewicz functions, including nonsmooth cases, expanding understanding of zeroth-order optimization.

Findings

01

Function values can converge faster than the iterates.

02

Convergence rates hold for both smooth and nonsmooth functions.

03

Results apply to a broad class of Łojasiewicz functions.

Abstract

We prove convergence rates of Stochastic Zeroth-order Gradient Descent (SZGD) algorithms for Łojasiewicz functions. The SZGD algorithm iterates as
\begin{align*} \mathbf{x}_{t+1} = \mathbf{x}_t - \eta_t \widehat{\nabla} f (\mathbf{x}_t), \qquad t = 0,1,2,3,\cdots , \end{align*}
where $f$ is the objective function that satisfies the Łojasiewicz inequality with Łojasiewicz exponent $\theta$, $\eta_t$ is the step size (learning rate), and $\widehat{\nabla} f (\mathbf{x}_t)$ is the approximate gradient estimated using zeroth-order information only. Our results show that $\{ f (\mathbf{x}_t) - f (\mathbf{x}_\infty) \}_{t \in \mathbb{N}}$ can converge faster than $\{ \| \mathbf{x}_t - \mathbf{x}_\infty \| \}_{t \in \mathbb{N}}$, regardless of whether the objective $f$ is smooth or nonsmooth.
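To make the iteration in the abstract concrete, here is a minimal Python sketch under stated assumptions. The abstract does not specify how the approximate gradient is built, so the two-point random-direction estimator below is a common stand-in and not necessarily the paper's estimator. The names `zo_gradient`, `szgd`, `num_dirs`, `delta`, and the quadratic test objective are all hypothetical choices made for illustration.

```python
import numpy as np


def zo_gradient(f, x, rng, num_dirs=4, delta=1e-4):
    """Estimate grad f(x) from function values only (assumed estimator).

    Averages central finite differences along random unit directions and
    rescales by the dimension so the estimate is roughly unbiased.
    """
    n = x.size
    g = np.zeros(n)
    for _ in range(num_dirs):
        v = rng.standard_normal(n)
        v /= np.linalg.norm(v)  # uniformly distributed direction on the unit sphere
        slope = (f(x + delta * v) - f(x - delta * v)) / (2.0 * delta)
        g += slope * v
    return (n / num_dirs) * g


def szgd(f, x0, step_size, num_steps, seed=0):
    """Run x_{t+1} = x_t - eta_t * (estimated gradient at x_t)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for t in range(num_steps):
        x = x - step_size(t) * zo_gradient(f, x, rng)
    return x


def quadratic(x):
    """Toy objective with minimizer at the origin (illustration only)."""
    return 0.5 * float(np.dot(x, x))


# Constant step size eta_t = 0.1 is an arbitrary choice for this toy problem.
x_final = szgd(quadratic, x0=[1.0, -2.0, 0.5], step_size=lambda t: 0.1, num_steps=500)
```

For this toy quadratic the function-value gap equals half the squared distance to the minimizer, so it shrinks faster than the distance itself once the iterate is close. This only illustrates the kind of behavior the abstract describes and does not reproduce the paper's analysis or rates.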

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Stochastic processes and financial applications