ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization

Xiangyi Chen; Sijia Liu; Kaidi Xu; Xingguo Li; Xue Lin; Mingyi Hong; David Cox

arXiv:1910.06513·cs.LG·October 17, 2019·34 cites

TL;DR

This paper introduces ZO-AdaMM, a zeroth-order adaptive momentum method for black-box optimization, demonstrating faster convergence than existing methods and applications in adversarial attacks.

Contribution

We propose ZO-AdaMM, extending AdaMM to gradient-free settings, analyze its convergence, and apply it to black-box neural network attacks.

Findings

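01 ZO-AdaMM converges faster than 6 state-of-the-art ZO methods in black-box attack experiments on ImageNet.

02 For both convex and nonconvex optimization, the convergence rate is roughly a factor of O(√d) worse than that of first-order AdaMM, where d is the problem size.

03 Mahalanobis distance plays a central role in the convergence of ZO-AdaMM and other AdaMM-type methods.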

Abstract

The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems. However, AdaMM is not suited for solving black-box optimization problems, where explicit gradient forms are difficult or infeasible to obtain. In this paper, we propose a zeroth-order AdaMM (ZO-AdaMM) algorithm that generalizes AdaMM to the gradient-free regime. We show that the convergence rate of ZO-AdaMM for both convex and nonconvex optimization is roughly a factor of $O(\sqrt{d})$ worse than that of the first-order AdaMM algorithm, where $d$ is the problem size. In particular, we provide a deep understanding of why Mahalanobis distance matters in the convergence of ZO-AdaMM and other AdaMM-type methods. As a byproduct, our analysis makes the first step…
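To make the idea in the abstract concrete, the following is a minimal sketch of how an AdaMM-style update might be driven by function values alone. It is not taken from the official repository. The two-point random-direction estimator, the AMSGrad-style running maximum, the hyperparameter values, and the names `zo_gradient` and `zo_adamm` are all assumptions chosen for illustration. The sketch is unconstrained, so it contains no projection step and does not exercise the Mahalanobis-distance projection the abstract refers to.

```python
import numpy as np

def zo_gradient(f, x, mu, rng):
    """Two-point random-direction estimate of the gradient of f at x.

    Assumed estimator (illustrative): (d / mu) * (f(x + mu*u) - f(x)) * u,
    with u drawn uniformly from the unit sphere. Uses only function values.
    """
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # normalize to a unit-norm direction
    return (d / mu) * (f(x + mu * u) - f(x)) * u

def zo_adamm(f, x0, steps=2000, lr=0.05, beta1=0.9, beta2=0.99, mu=1e-3, seed=0):
    """Unconstrained AdaMM-style loop fed by zeroth-order gradient estimates.

    All hyperparameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)           # momentum: running average of estimates
    v = np.zeros_like(x)           # running average of squared estimates
    v_hat = np.full_like(x, 1e-8)  # elementwise running maximum of v
    for _ in range(steps):
        g = zo_gradient(f, x, mu, rng)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        v_hat = np.maximum(v_hat, v)     # keeps per-coordinate step sizes non-increasing
        x = x - lr * m / np.sqrt(v_hat)  # adaptive, per-coordinate step
    return x

if __name__ == "__main__":
    # Toy black-box objective: only function evaluations are available.
    f = lambda x: float(np.sum((x - 1.0) ** 2))
    x_final = zo_adamm(f, np.zeros(5))
    print(x_final, f(x_final))
```

Each iteration costs two function evaluations and no gradient evaluations. The estimate is scaled by the dimension `d`, so its variance grows with `d`, which gives some intuition for why a dimension-dependent slowdown relative to first-order AdaMM is to be expected.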

Code & Models

Repositories

KaidiXu/ZO-AdaMM (tf, Official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM