A New One-Point Residual-Feedback Oracle For Black-Box Learning and Control

Yan Zhang; Yi Zhou; Kaiyi Ji; Michael M. Zavlanos

arXiv:2006.10820 · math.OC · September 9, 2021 · 5 citations

Open Access

TL;DR

This paper introduces a one-point residual-feedback scheme for zeroth-order optimization. It achieves convergence rates comparable to those of two-point schemes while requiring only a single function evaluation per iteration, which makes it practical for online applications.

Contribution

The paper proposes a one-point residual-feedback method for black-box optimization that matches the query complexity of two-point schemes and improves on that complexity under certain conditions.

Findings

01

For deterministic Lipschitz functions, the query complexity matches that of two-point schemes.

02

In stochastic bandit problems, the method achieves the same convergence rate as two-point schemes.

03

In practical experiments, the method performs comparably to two-point schemes.

Abstract

Zeroth-order optimization (ZO) algorithms have been recently used to solve black-box or simulation-based learning and control problems, where the gradient of the objective function cannot be easily computed but can be approximated using the objective function values. Many existing ZO algorithms adopt two-point feedback schemes due to their fast convergence rate compared to one-point feedback schemes. However, two-point schemes require two evaluations of the objective function at each iteration, which can be impractical in applications where the data are not all available a priori, e.g., in online optimization. In this paper, we propose a novel one-point feedback scheme that queries the function value once at each iteration and estimates the gradient using the residual between two consecutive points. When optimizing a deterministic Lipschitz function, we show that the query complexity of…
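The one-point idea described in the abstract can be illustrated with a minimal sketch. The abstract only says that the function is queried once per iteration and that the gradient is estimated from the residual between two consecutive points. The concrete form used here, g_t = (f(x_t + δu_t) − f(x_{t−1} + δu_{t−1})) / δ · u_t with Gaussian directions u_t, is an assumption, as are the function name `residual_feedback_zo`, the parameters `delta` and `step`, and the quadratic test objective.

```python
import numpy as np

def residual_feedback_zo(f, x0, delta=0.1, step=0.005, iters=2000, seed=0):
    """Zeroth-order descent with a one-point residual-feedback estimate.

    Illustrative sketch only: the estimator form, Gaussian perturbations,
    and default parameters are assumptions, not taken from the text above.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)

    # One initial query so that the first iteration has a previous value.
    u = rng.standard_normal(x.shape)
    prev_value = f(x + delta * u)
    queries = 1

    for _ in range(iters):
        u = rng.standard_normal(x.shape)
        value = f(x + delta * u)          # the only query in this iteration
        queries += 1

        # Residual between consecutive perturbed evaluations, scaled by 1/delta,
        # along the current random direction.
        grad_est = (value - prev_value) / delta * u

        x = x - step * grad_est
        prev_value = value                # reuse this query next iteration

    return x, queries


# Usage on a hypothetical smooth objective f(x) = 0.5 * ||x||^2.
quadratic = lambda x: 0.5 * float(np.dot(x, x))
x_final, n_queries = residual_feedback_zo(quadratic, [1.0, -1.0])
print(n_queries, quadratic(x_final))
```

A two-point scheme would instead evaluate `f` twice around the same iterate in every iteration. In this sketch the second value is replaced by the query already made in the previous iteration, so the step size is kept small relative to `delta` to stop the residual from being dominated by the movement of the iterate.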

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms