Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization
Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, Lisa Amini

TL;DR
This paper introduces ZO-SVRG, a variance-reduced zeroth-order optimization algorithm, analyzes its convergence theoretically, and evaluates it on two black-box problems: chemical material classification and adversarial example generation for deep neural networks.
Contribution
It presents a novel variance-reduced zeroth-order algorithm, ZO-SVRG, together with a theoretical analysis and accelerated variants.
Findings
ZO-SVRG outperforms existing ZO algorithms in convergence speed
Accelerated ZO-SVRG achieves the best known iteration rate for ZO stochastic optimization
Experimental results validate the effectiveness of the proposed methods in real-world applications
Abstract
As application demands for zeroth-order (gradient-free) optimization accelerate, the need for variance reduced and faster converging approaches is also intensifying. This paper addresses these challenges by presenting: a) a comprehensive theoretical analysis of variance reduced zeroth-order (ZO) optimization, b) a novel variance reduced ZO algorithm, called ZO-SVRG, and c) an experimental evaluation of our approach in the context of two compelling applications, black-box chemical material classification and generation of adversarial examples from black-box deep neural network models. Our theoretical analysis uncovers an essential difficulty in the analysis of ZO-SVRG: the unbiased assumption on gradient estimates no longer holds. We prove that compared to its first-order counterpart, ZO-SVRG with a two-point random gradient estimator could suffer an additional error of order ,…
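To make the idea in the abstract concrete, here is a minimal Python sketch of an SVRG-style loop driven by a two-point random gradient estimator. It is an illustration under stated assumptions, not the authors' implementation: the estimator form `(d / mu) * (f(x + mu * u) - f(x)) * u` with `u` drawn uniformly from the unit sphere, the smoothing parameter `mu`, the reuse of one direction for the current point and the snapshot, and every function name and default value below are choices made for this example. The toy objective (an average of quadratics) is likewise hypothetical.

```python
import numpy as np


def sample_unit_vector(rng, d):
    """Draw a direction uniformly from the unit sphere in R^d."""
    u = rng.standard_normal(d)
    return u / np.linalg.norm(u)


def rand_grad_est(f, x, u, mu):
    """Two-point random gradient estimate of f at x along unit direction u.

    Uses only two function evaluations (no gradients). The d / mu scaling
    is an assumption of this sketch.
    """
    d = x.size
    return (d / mu) * (f(x + mu * u) - f(x)) * u


def zo_svrg(fs, x0, step=0.05, mu=1e-3, epochs=20, inner_steps=10,
            batch=2, seed=0):
    """SVRG-style loop where every gradient is replaced by a ZO estimate.

    fs is a list of component functions; the objective is their average.
    All hyperparameter defaults are illustrative only.
    """
    rng = np.random.default_rng(seed)
    n, d = len(fs), x0.size
    x = np.array(x0, dtype=float)
    for _ in range(epochs):
        snapshot = x.copy()
        # ZO estimate of the full gradient at the snapshot point.
        g_snap = np.mean(
            [rand_grad_est(f, snapshot, sample_unit_vector(rng, d), mu)
             for f in fs],
            axis=0,
        )
        for _ in range(inner_steps):
            idx = rng.choice(n, size=batch, replace=False)
            v = g_snap.copy()
            for i in idx:
                # Assumption: same direction u at x and at the snapshot.
                u = sample_unit_vector(rng, d)
                diff = (rand_grad_est(fs[i], x, u, mu)
                        - rand_grad_est(fs[i], snapshot, u, mu))
                v += diff / batch
            x -= step * v
    return x


# Toy usage: minimize the average of 0.5 * ||x - c_i||^2 over 20 centers.
toy_rng = np.random.default_rng(1)
centers = 0.1 * toy_rng.standard_normal((20, 5))
fs = [lambda x, c=c: 0.5 * np.sum((x - c) ** 2) for c in centers]


def mean_loss(x):
    return float(np.mean([f(x) for f in fs]))


x0 = 5.0 * np.ones(5)
x_final = zo_svrg(fs, x0)
```

Comparing `mean_loss(x0)` with `mean_loss(x_final)` shows how far the function-value-only updates moved the iterate. Note that the snapshot term `g_snap` is itself a noisy estimate rather than an exact gradient, which is one intuitive reason the unbiasedness assumption of first-order SVRG cannot be carried over directly.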
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Machine Learning and ELM
