Sparse Perturbations for Improved Convergence in Stochastic Zeroth-Order Optimization

Mayumi Ohta; Nathaniel Berger; Artem Sokolov; Stefan Riezler

arXiv:2006.01759·stat.ML·November 11, 2020·1 cites

TL;DR

This paper introduces a sparse stochastic zeroth-order optimization method that improves convergence speed by reducing dependency on the problem's dimensionality, validated through theoretical proof and neural network experiments.

Contribution

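The paper proposes a sparse SZO method whose convergence depends on the expected dimensionality of the random perturbation rather than on the full dimensionality of the problem. The result holds for non-convex functions without assuming sparsity of the objective function or its gradient, and is supported by a proof and by empirical results.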

Findings

01

Faster convergence in training loss and test accuracy on MNIST and CIFAR.

02

Smaller gradient approximation error compared to dense SZO.

03

Theoretical justification for reducing the dimensionality dependence of convergence in sparse SZO.

Abstract

Interest in stochastic zeroth-order (SZO) methods has recently been revived in black-box optimization scenarios such as adversarial black-box attacks on deep neural networks. SZO methods only require the ability to evaluate the objective function at random input points; their weakness, however, is that their convergence speed depends on the dimensionality of the function to be evaluated. We present a sparse SZO optimization method that reduces this factor to the expected dimensionality of the random perturbation during learning. We give a proof that justifies this reduction for sparse SZO optimization for non-convex functions without making any assumptions on sparsity of objective function or gradient. Furthermore, we present experimental results for neural networks on MNIST and CIFAR that show faster convergence in training loss and test accuracy, and a smaller distance of the…
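
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository listed under Code & Models for that). It assumes a standard two-point finite-difference SZO estimator and, as a further assumption, selects the perturbed coordinates with an independent random mask; the paper may choose them differently. All names here (`sparse_szo_gradient`, `sparse_szo_minimize`, `keep_prob`, `mu`, `quadratic`) are hypothetical.

```python
import random


def sparse_szo_gradient(f, x, keep_prob=0.2, mu=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate with a sparse perturbation.

    Assumption: each coordinate is perturbed independently with probability
    keep_prob, so the expected perturbation dimensionality is keep_prob * len(x).
    """
    if rng is None:
        rng = random.Random()
    # Gaussian direction, zeroed outside the randomly selected coordinates.
    u = [rng.gauss(0.0, 1.0) if rng.random() < keep_prob else 0.0 for _ in x]
    x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
    # Only function evaluations are used; no analytic gradient is needed.
    scale = (f(x_pert) - f(x)) / mu
    return [scale * ui for ui in u]


def sparse_szo_minimize(f, x0, steps=2000, lr=0.02, keep_prob=0.2, mu=1e-3, seed=0):
    """Plain descent loop driven by the sparse estimate above."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = sparse_szo_gradient(f, x, keep_prob, mu, rng)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Toy objective used only for illustration."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    x0 = [1.0] * 20
    x_final = sparse_szo_minimize(quadratic, x0)
    print("initial loss:", quadratic(x0))
    print("final loss:", quadratic(x_final))
```

Setting `keep_prob=1.0` recovers a dense perturbation, which gives a simple baseline for comparing the two variants on the same objective.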

Code & Models

Repositories

StatNLP/sparse_szo
PyTorch · Official

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Sparse and Compressive Sensing Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings