High-Probability Guarantees for Random Zeroth-Order (Stochastic) Gradient Descent

Haishan Ye

arXiv:2604.23613 · math.OC · April 28, 2026

TL;DR

This paper establishes high-probability convergence guarantees for random zeroth-order gradient descent in both deterministic and stochastic settings, providing confidence bounds and query complexity analysis.

Contribution

It offers high-probability guarantees for random zeroth-order gradient descent on smooth and strongly convex objectives, a setting where classical analyses mostly hold only in expectation.

Findings

01

Deterministic case: the classical two-query method finds an ε-suboptimal solution with probability at least 1 − δ using O((dL/μ) log(1/ε) + log(1/δ)) queries.

02

Stochastic case: achieves ε-suboptimality with probability at least 1 − δ using O(d log(1/ε) (log(1/ε) + log(1/δ))/ε) queries.

03

The confidence level δ enters both bounds only through log(1/δ); in the deterministic case it is a purely additive term on top of the expectation-based query complexity.
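To get a feel for how the confidence level enters the two query counts listed above, the expressions can be evaluated numerically. The following is a minimal sketch that drops the hidden constants of the O(·) notation, so the outputs are only meaningful relative to each other. The function names are hypothetical, and the stochastic expression follows one reading of the formula as printed in the findings: d · log(1/ε) · (log(1/ε) + log(1/δ)) / ε.

```python
import math


def deterministic_queries(d, L, mu, eps, delta):
    """Evaluate (d L / mu) * log(1/eps) + log(1/delta), constants dropped."""
    return (d * L / mu) * math.log(1.0 / eps) + math.log(1.0 / delta)


def stochastic_queries(d, eps, delta):
    """Evaluate d * log(1/eps) * (log(1/eps) + log(1/delta)) / eps, constants dropped.

    Assumption: this grouping is one reading of the bound as printed.
    """
    log_eps = math.log(1.0 / eps)
    return d * log_eps * (log_eps + math.log(1.0 / delta)) / eps


# Illustrative values only: tightening delta from 1e-2 to 1e-6 in the
# deterministic expression adds log(1e4), independent of d, L, mu and eps.
loose = deterministic_queries(d=100, L=10.0, mu=1.0, eps=1e-3, delta=1e-2)
tight = deterministic_queries(d=100, L=10.0, mu=1.0, eps=1e-3, delta=1e-6)
extra_for_confidence = tight - loose
```

In the deterministic expression the extra cost of a smaller δ does not scale with the dimension or the condition number, which is the sense in which the confidence requirement is cheap there.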

Abstract

Zeroth-order optimization aims to minimize an objective function using only function evaluations, and is therefore fundamental in black-box optimization, hyperparameter tuning, bandit learning, and adversarial machine learning. While classical zeroth-order methods are well understood in expectation, much less is known about their high-probability behavior, especially for smooth and strongly convex objectives. In this paper, we establish high-probability convergence guarantees for random zeroth-order gradient descent in both deterministic and stochastic settings. For deterministic $L$-smooth and $\mu$-strongly convex objectives in dimension $d$, we show that the classical two-query random zeroth-order method finds an $\varepsilon$-suboptimal solution with probability at least $1 - \delta$ using \[ \mathcal{O}\left( \frac{dL}{\mu}\log\frac{1}{\varepsilon} + \log\frac{1}{\delta}…
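For readers unfamiliar with two-query random zeroth-order methods, the sketch below shows one common variant on a toy quadratic. It is an illustration under stated assumptions, not necessarily the exact scheme analyzed in the paper: the search direction is drawn from a standard Gaussian, the directional slope is estimated by a central difference, and the step size 1/(4(d+4)L) is a conservative choice borrowed from the classical Gaussian-smoothing analysis. The function name `zo_gradient_descent`, the smoothing radius `alpha`, and the test problem are all hypothetical.

```python
import numpy as np


def zo_gradient_descent(f, x0, L, num_iters, alpha=1e-4, seed=0):
    """Two-query random zeroth-order gradient descent (illustrative variant).

    Each iteration queries f twice, at x + alpha*u and x - alpha*u, and moves
    along the random direction u scaled by the estimated directional slope.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    step = 1.0 / (4.0 * (d + 4) * L)  # assumed step size, not taken from the paper
    queries = 0
    for _ in range(num_iters):
        u = rng.standard_normal(d)  # random Gaussian search direction
        # Central-difference estimate of the directional derivative along u.
        slope = (f(x + alpha * u) - f(x - alpha * u)) / (2.0 * alpha)
        queries += 2
        x -= step * slope * u  # gradient-free descent step
    return x, queries


# Toy objective: f(x) = 0.5 * sum(eigs * x^2), with mu = 1 and L = 4.
eigs = np.array([1.0, 2.0, 3.0, 4.0, 4.0])


def quadratic(x):
    return 0.5 * float(np.sum(eigs * x * x))


x0 = np.ones(5)
x_final, n_queries = zo_gradient_descent(quadratic, x0, L=4.0, num_iters=3000)
```

Each iteration costs exactly two function evaluations, so the query counts in the bounds correspond to twice the iteration count for this kind of scheme. A single run like this is one sample path; a high-probability guarantee bounds how bad such a path can be, whereas an expectation bound only controls the average over many runs.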
