Stochastic Steffensen method

Minda Zhao, Zehua Lai, and Lek-Heng Lim

arXiv:2211.15310·math.OC·November 29, 2022·1 cites

Open Access

TL;DR

This paper introduces a stochastic Steffensen method that achieves super-quadratic convergence without second derivatives, suitable for large-scale optimization, and demonstrates its effectiveness through extensive experiments.

Contribution

It proposes a novel stochastic optimization method based on Steffensen's approach, achieving high convergence orders without hyperparameter tuning and generalizing the randomized Kaczmarz method.

Findings

01. Outperforms existing first-order methods in experiments.

02. Achieves convergence order of approximately 2.414 with optimal step size.

03. Reduces to the randomized Kaczmarz method for quadratic objectives.
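
For readers unfamiliar with the randomized Kaczmarz method mentioned in the last finding, the following is a minimal sketch of the classical algorithm for a consistent linear system Ax = b, with rows sampled in proportion to their squared norms. It is not the paper's stochastic Steffensen method. The function name `randomized_kaczmarz`, the small test system, the iteration count, and the seed are all illustrative assumptions.

```python
import random


def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Classical randomized Kaczmarz for a consistent system A x = b.

    A is a list of rows, b a list of right-hand sides.
    Illustrative sketch only; not the algorithm proposed in the paper.
    """
    rng = random.Random(seed)
    n = len(A[0])
    # Squared Euclidean norm of each row; also used as sampling weights.
    sq_norms = [sum(a * a for a in row) for row in A]
    x = [0.0] * n
    for _ in range(iters):
        # Sample a row index with probability proportional to its squared norm.
        i = rng.choices(range(len(A)), weights=sq_norms)[0]
        row = A[i]
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        # Project the current iterate onto the hyperplane row . x = b[i].
        step = residual / sq_norms[i]
        x = [xj + step * a for xj, a in zip(x, row)]
    return x


# Hypothetical 3x2 consistent system whose solution is x_true.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, 2.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]

x_hat = randomized_kaczmarz(A, b)
```

Each update uses a single row of the system, which is the sense in which the method is stochastic: one equation is sampled per step, much as a stochastic gradient method samples one data point or batch.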

Abstract

Is it possible for a first-order method, i.e., only first derivatives allowed, to be quadratically convergent? For univariate loss functions, the answer is yes -- the Steffensen method avoids second derivatives and is still quadratically convergent like Newton's method. By incorporating an optimal step size we can even push its convergence order beyond quadratic to $1 + \sqrt{2} \approx 2.414$. While such high convergence orders are a pointless overkill for a deterministic algorithm, they become rewarding when the algorithm is randomized for problems of massive sizes, as randomization invariably compromises convergence speed. We will introduce two adaptive learning rates inspired by the Steffensen method, intended for use in a stochastic optimization setting and requiring no hyperparameter tuning aside from batch size. Extensive experiments show that they compare favorably with several…
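
To make the abstract's first claim concrete, here is a minimal sketch of the classical (deterministic) Steffensen iteration applied to the derivative of a univariate loss, so that only first derivatives are evaluated. It uses the textbook update x ← x − g(x)² / (g(x + g(x)) − g(x)) with g = f′. It does not include the optimal step size or the stochastic learning rates the paper proposes. The function names, the test objective f(x) = eˣ − 2x, the starting point, and the tolerances are assumptions chosen for illustration.

```python
import math


def steffensen_minimize(grad, x0, tol=1e-12, max_iter=50):
    """Find a stationary point of a univariate function from its derivative.

    Classical Steffensen iteration on g = f'; no second derivative is used.
    Illustrative sketch only; not the paper's stochastic variant.
    """
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        # Finite-difference surrogate for f''(x) * g, built from two
        # first-derivative evaluations.
        denom = grad(x + g) - g
        if denom == 0.0:
            break  # cannot take a step; stop rather than divide by zero
        x = x - g * g / denom
    return x


def grad_f(x):
    # Derivative of the hypothetical test loss f(x) = exp(x) - 2x,
    # whose unique minimizer is x = ln 2.
    return math.exp(x) - 2.0


x_star = steffensen_minimize(grad_f, x0=1.0)
```

Each iteration costs two derivative evaluations, `grad(x)` and `grad(x + g)`, in place of the one derivative and one second derivative that a Newton step would need.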

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Advanced Bandit Algorithms Research

Methods: Stochastic Gradient Descent