Enhanced Derivative-Free Optimization Using Adaptive Correlation-Induced Finite Difference Estimators

Guo Liang; Guangwu Liu; Kun Zhang

arXiv:2502.20819·math.OC·March 3, 2025

TL;DR

This paper introduces an adaptive correlation-induced finite-difference (FD) estimator for derivative-free optimization (DFO). The estimator improves both gradient-estimation efficiency and sample efficiency while matching the convergence rate of the KW and SPSA methods, and numerical experiments demonstrate its effectiveness.

Contribution

The paper proposes a batch-based FD estimator combined with adaptive batch-size selection and a stochastic line search, improving the efficiency and convergence behavior of DFO.

Findings

01 Achieves the same convergence rate as the KW and SPSA methods.

02 Demonstrates superior empirical performance in numerical experiments.

03 Provides a consistent and efficient alternative for gradient estimation in DFO.

Abstract

Gradient-based methods are well-suited for derivative-free optimization (DFO), where finite-difference (FD) estimates are commonly used as gradient surrogates. Traditional stochastic approximation methods, such as Kiefer-Wolfowitz (KW) and simultaneous perturbation stochastic approximation (SPSA), typically utilize only two samples per iteration, resulting in imprecise gradient estimates and necessitating diminishing step sizes for convergence. In this paper, we first explore an efficient FD estimate, referred to as correlation-induced FD estimate, which is a batch-based estimate. Then, we propose an adaptive sampling strategy that dynamically determines the batch size at each iteration. By combining these two components, we develop an algorithm designed to enhance DFO in terms of both gradient estimation efficiency and sample efficiency. Furthermore, we establish the consistency of our…
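
To make the contrast between a single evaluation pair and a batch-based estimate concrete, here is a minimal sketch of a central FD gradient estimate that averages several noisy evaluation pairs per coordinate. It uses plain batch averaging and is not the paper's correlation-induced estimator, whose construction is not given in this excerpt. The test function `noisy_objective`, the helper `batch_central_fd_gradient`, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def noisy_objective(x, rng, noise_std=1.0):
    """Hypothetical test problem: a quadratic observed with Gaussian noise."""
    return float(np.sum(x ** 2) + noise_std * rng.normal())


def batch_central_fd_gradient(f, x, h, batch_size, rng):
    """Central FD gradient estimate averaged over `batch_size` noisy pairs.

    Plain batch averaging (an assumption for illustration), not the
    correlation-induced estimator from the paper. With batch_size=1 each
    coordinate uses a single pair of noisy evaluations.
    """
    d = x.size
    grad = np.zeros(d)
    for i in range(d):
        step = np.zeros(d)
        step[i] = h  # perturb only coordinate i
        diffs = [
            (f(x + step, rng) - f(x - step, rng)) / (2.0 * h)
            for _ in range(batch_size)
        ]
        grad[i] = np.mean(diffs)
    return grad


rng = np.random.default_rng(0)
x = np.array([1.0, -2.0])  # true gradient of sum(x**2) here is [2, -4]

single = batch_central_fd_gradient(noisy_objective, x, h=0.1, batch_size=1, rng=rng)
batched = batch_central_fd_gradient(noisy_objective, x, h=0.1, batch_size=1000, rng=rng)

print("one pair per coordinate:", single)
print("batch of 1000 pairs:    ", batched)
```

With unit observation noise and `h = 0.1`, each single-pair difference quotient has a standard deviation of about 7, so the one-pair estimate is dominated by noise, while averaging 1000 pairs reduces it to roughly 0.22. That cost in function evaluations is what motivates choosing the batch size adaptively rather than fixing it in advance.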
