Refining Adaptive Zeroth-Order Optimization at Ease

Yao Shu; Qixin Zhang; Kun He; Zhongxiang Dai

arXiv:2502.01014·cs.LG·June 10, 2025

TL;DR

This paper introduces R-AdaZO, an adaptive zeroth-order optimization method that exploits the variance reduction effect of the first moment estimate, and refines the second moment estimate accordingly, to improve convergence speed and stability in black-box and resource-constrained scenarios.

Contribution

It provides the first variance reduction analysis for first moment estimates in ZO optimization and develops a variance-aware convergence framework for adaptive ZO methods.

Findings

01. R-AdaZO achieves faster convergence than ZO-AdaMM.

02. Theoretical analysis confirms variance reduction benefits.

03. Experiments demonstrate improved performance in black-box attacks and LLM fine-tuning.

Abstract

Zeroth-order (ZO) optimization has recently come to play an essential role in scenarios where gradient information is inaccessible or unaffordable, such as black-box systems and resource-constrained environments. While existing adaptive methods such as ZO-AdaMM have shown promise, they are fundamentally limited by their underutilization of moment information during optimization, which usually results in suboptimal convergence. To overcome these limitations, this paper introduces Refined Adaptive Zeroth-Order Optimization (R-AdaZO). Specifically, we first show the untapped variance reduction effect of the first moment estimate on ZO gradient estimation, which improves the accuracy and stability of ZO updates. We then refine the second moment estimate based on these variance-reduced gradient estimates to better capture the geometry of the optimization landscape, enabling a more effective scaling of ZO…
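The two ideas in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract is truncated and gives no update rule, so the code assumes an Adam-style loop in which the second moment is accumulated from the squared first moment `m` (the variance-reduced estimate) rather than from the squared raw ZO gradient `g`. The function names `zo_gradient` and `adaptive_zo_minimize`, the Gaussian two-point estimator, and all hyperparameter values are hypothetical choices made for illustration.

```python
import numpy as np


def zo_gradient(f, x, rng, mu=1e-3, num_queries=10):
    """Estimate the gradient of f at x from function values only.

    Assumption: a forward-difference estimator along random Gaussian
    directions, averaged over num_queries directions.
    """
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(num_queries):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_queries


def adaptive_zo_minimize(f, x0, steps=500, lr=0.01, beta1=0.9,
                         beta2=0.99, eps=1e-8, refine=True, seed=0):
    """Adam-style ZO loop (sketch, not the paper's exact algorithm).

    refine=True  : second moment built from the first moment m (assumed
                   form of the "refined" second moment estimate).
    refine=False : second moment built from the raw ZO gradient g,
                   as in a standard Adam-like update.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)  # first moment: moving average of ZO gradients
    v = np.zeros_like(x)  # second moment: moving average of squares
    for _ in range(steps):
        g = zo_gradient(f, x, rng)
        m = beta1 * m + (1.0 - beta1) * g
        source = m if refine else g
        v = beta2 * v + (1.0 - beta2) * source ** 2
        x -= lr * m / (np.sqrt(v) + eps)
    return x


def sphere(x):
    """Toy objective: squared Euclidean norm, minimized at the origin."""
    return float(np.sum(x ** 2))
```

Calling `adaptive_zo_minimize(sphere, np.ones(5))` runs the assumed refined variant on a toy quadratic; passing `refine=False` switches to the raw-gradient second moment so the two scalings can be compared on the same objective.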

Taxonomy

Topics: Iterative Methods for Nonlinear Equations