Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
Shaocong Ma, Heng Huang

TL;DR
This paper investigates optimal random perturbation distributions for two-point zeroth-order gradient estimators, revealing that directionally aligned perturbations can reduce variance and improve optimization accuracy.
Contribution
It introduces the concept of directionally aligned perturbations (DAP), optimizing perturbation distributions for lower variance and higher accuracy in zeroth-order optimization.
Findings
DAP schemes outperform traditional fixed-length perturbations
Theoretical analysis extends convergence bounds for $\delta$-unbiased perturbations
Empirical results show improved performance on synthetic and practical tasks
Abstract
In this paper, we explore the two-point zeroth-order gradient estimator and identify the distribution of random perturbations that minimizes the estimator's asymptotic variance as the perturbation stepsize tends to zero. We formulate it as a constrained functional optimization problem over the space of perturbation distributions. Our findings reveal that such desired perturbations can align directionally with the true gradient, instead of maintaining a fixed length. While existing research has largely focused on fixed-length perturbations, the potential advantages of directional alignment have been overlooked. To address this gap, we delve into the theoretical and empirical properties of the directionally aligned perturbation (DAP) scheme, which adaptively offers higher accuracy along critical directions. Additionally, we provide a convergence analysis for stochastic gradient descent…
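For readers unfamiliar with the estimator the abstract discusses, here is a minimal sketch of a two-point zeroth-order gradient estimate under two common perturbation choices: Gaussian directions and fixed-length directions on a sphere of radius sqrt(d). It is not the paper's DAP scheme. The forward-difference form, the normalization E[v v^T] = I, the quadratic test function, and all function names are assumptions made for illustration.

```python
import math
import random


def two_point_estimate(f, x, v, mu=1e-4):
    """Forward-difference two-point estimate: (f(x + mu*v) - f(x)) / mu * v."""
    fx_plus = f([xi + mu * vi for xi, vi in zip(x, v)])
    scale = (fx_plus - f(x)) / mu
    return [scale * vi for vi in v]


def gaussian_direction(rng, d):
    """Standard Gaussian perturbation, so E[v v^T] = I."""
    return [rng.gauss(0.0, 1.0) for _ in range(d)]


def fixed_length_direction(rng, d):
    """Uniform direction rescaled to length sqrt(d), so E[v v^T] = I as well."""
    v = gaussian_direction(rng, d)
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [math.sqrt(d) * vi / norm for vi in v]


def average_estimate(f, x, sampler, rng, n):
    """Average n independent two-point estimates drawn with the given sampler."""
    d = len(x)
    total = [0.0] * d
    for _ in range(n):
        g = two_point_estimate(f, x, sampler(rng, d))
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n for t in total]


def quadratic(x):
    """Hypothetical test function f(x) = 0.5 * ||x||^2, whose gradient is x."""
    return 0.5 * sum(xi * xi for xi in x)


rng = random.Random(0)
x0 = [1.0, -2.0, 0.5]  # the true gradient of `quadratic` at x0 is x0 itself
avg_gauss = average_estimate(quadratic, x0, gaussian_direction, rng, 20000)
avg_fixed = average_estimate(quadratic, x0, fixed_length_direction, rng, 20000)
```

Both averages should land close to the true gradient `x0`, since each sampler satisfies E[v v^T] = I. The two schemes differ in the variance of a single estimate, which is the quantity the paper sets out to minimize over the choice of perturbation distribution.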
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Advanced Optimization Algorithms Research
