On the Almost Sure Convergence of the Stochastic Three Points Algorithm

Taha El Bakkali El Kadi; Omar Saadi

arXiv:2501.13886·math.OC·February 11, 2026·ICLR

Open Access · 3 Reviews

TL;DR

This paper establishes the first almost sure convergence results for the stochastic three points (STP) algorithm across various classes of smooth functions, providing convergence rates in expectation and almost surely for non-convex, convex, and strongly convex cases.

Contribution

It provides the first almost sure convergence analysis of the STP algorithm for different classes of smooth functions, including convergence rates and conditions.

Findings

01

Almost sure convergence of the gradient to zero for non-convex functions.

02

Convergence of function values to the minimum for convex functions.

03

Linear convergence rate for strongly convex functions.

Abstract

The stochastic three points (STP) algorithm is a derivative-free optimization technique designed for unconstrained optimization problems in $\mathbb{R}^{d}$. In this paper, we analyze this algorithm for three classes of functions: smooth functions that may lack convexity, smooth convex functions, and smooth functions that are strongly convex. Our work provides the first almost sure convergence results of the STP algorithm, alongside some convergence results in expectation. For the class of smooth functions, we establish that the best gradient iterate of the STP algorithm converges almost surely to zero at a rate of $o(1/T^{\frac{1}{2} - \epsilon})$ for any $\epsilon \in (0, \frac{1}{2})$, where $T$ is the number of iterations. Furthermore, within the same class of functions, we establish both almost sure convergence and convergence in expectation of the final gradient iterate towards zero.…
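
For readers unfamiliar with STP, the following is a minimal sketch of the standard iteration: draw a random direction, evaluate the objective at the current point and at one step forward and backward along that direction, and keep the best of the three. It is not taken from the paper. The function name `stp`, the unit-sphere direction sampling, and the $1/\sqrt{k+1}$ step-size schedule are illustrative assumptions, and the step sizes analyzed in the paper may differ.

```python
import numpy as np


def stp(f, x0, num_iters=2000, step0=1.0, seed=0):
    """Illustrative stochastic three points loop (assumed settings, see lead-in)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for k in range(num_iters):
        # Assumption: direction drawn uniformly from the unit sphere.
        s = rng.standard_normal(x.shape)
        s /= np.linalg.norm(s)
        # Assumption: decaying step size step0 / sqrt(k + 1).
        alpha = step0 / np.sqrt(k + 1)
        x_plus, x_minus = x + alpha * s, x - alpha * s
        f_plus, f_minus = f(x_plus), f(x_minus)
        # Keep the best of the three points; only function values are used.
        if f_plus < fx and f_plus <= f_minus:
            x, fx = x_plus, f_plus
        elif f_minus < fx:
            x, fx = x_minus, f_minus
    return x, fx


def quadratic(x):
    # Smooth, strongly convex test objective with minimum value 0 at the origin.
    return float(np.dot(x, x))


x_final, f_final = stp(quadratic, np.ones(5))
```

Because the current point is always one of the three candidates, the objective value never increases along a trajectory, which is the property that makes per-trajectory (almost sure) statements natural for this method.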

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

This paper provides the first almost-sure convergence analysis of the STP method for three standard classes of functions, a non-trivial achievement. This analysis guarantees convergence of each trajectory instance, making it valuable both theoretically and practically.

Weaknesses

- Motivation: The focus on STP in this paper is not well motivated in the abstract and introduction, aside from mentioning its low per-iteration complexity. Is STP widely used in practice? Given its simplicity and strong theoretical guarantees, I cannot see why this would not be considered by practitioners. I suggest that the authors further elaborate on the theoretical and practical importance of studying STP.
- Comparison to other existing derivative-free methods: Although this paper fo

Reviewer 02 · Rating 6 · Confidence 3

Strengths

The paper is well-written, with technical results clearly presented and explained. Also, different settings are considered in the paper.

Weaknesses

My major concern is the technical novelty of the paper.
1. Insufficient technical contribution. Although the paper considers multiple settings and gets almost-sure convergence results, the techniques seem to be a combination of [1] and other papers that establish almost-sure convergence of SGD in different settings. In particular, the results in Sections 3 and 4 look less surprising, since most of them are obtained by verifying some conditions and invoking an existing almost-sure convergence res

Reviewer 03 · Rating 6 · Confidence 3

Strengths

Paper is well-written. Convergence rates of the STP algorithm for smooth, smooth convex and smooth strongly convex functions are shown.

Weaknesses

The results are incremental. How does the convergence rate of STP compare with other methods, both in expectation and almost surely? For example, how does it compare with RGF or GLD, which are used as baselines in the experiments? This could be discussed or given as a table. What about the optimal convergence rate in each case? A brief discussion of the advantages of the STP algorithm in the introduction would be helpful.

Taxonomy

Topics: Matrix Theory and Algorithms · Advanced Optimization Algorithms Research