Multi-Point Bandit Algorithms for Nonstationary Online Nonconvex Optimization
Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra

TL;DR
This paper develops and analyzes bandit algorithms for nonstationary online nonconvex optimization, introducing novel regret measures based on stationary solutions and Hessian estimation, with applications to avoiding saddle points and structured nonconvex functions.
Contribution
It proposes new bandit algorithms for nonstationary nonconvex problems, including Hessian estimation via Gaussian Stein's identity and regret bounds for structured nonconvex functions.
Findings
Regret bounds for nonstationary nonconvex optimization using stationary solutions.
Hessian estimation in bandit setting via Gaussian Stein's identity.
Algorithms for avoiding saddle points in nonstationary bandit problems.
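The Hessian-estimation finding can be illustrated with a small sketch. It assumes the standard second-order Gaussian Stein identity, under which the Hessian of the Gaussian-smoothed function is the expectation of (u uᵀ − I)·[f(x + νu) + f(x − νu) − 2f(x)] / (2ν²) for u ~ N(0, I). The function name `stein_hessian_estimate`, the smoothing parameter `nu`, and the sample count are illustrative choices, not taken from the paper, and the paper's exact estimator and constants may differ.

```python
import numpy as np


def stein_hessian_estimate(f, x, nu=0.1, num_samples=1000, rng=None):
    """Monte Carlo Hessian estimate from function values only.

    Assumption (not from the paper's text): uses the symmetric
    second-order Gaussian Stein form, averaging
    (u u^T - I) * (f(x + nu*u) + f(x - nu*u) - 2 f(x)) / (2 nu^2)
    over Gaussian directions u ~ N(0, I).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    d = x.shape[0]

    # Random Gaussian probe directions, one per row.
    u = rng.standard_normal((num_samples, d))

    # Function-value (bandit) feedback: the center point plus two
    # perturbed points per direction.
    fx = f(x)
    f_plus = np.array([f(x + nu * ui) for ui in u])
    f_minus = np.array([f(x - nu * ui) for ui in u])

    # Scaled symmetric second difference for each direction.
    second_diff = (f_plus + f_minus - 2.0 * fx) / (2.0 * nu**2)

    # Sample mean of second_diff_i * (u_i u_i^T - I).
    weighted_outer = np.einsum("n,ni,nj->ij", second_diff, u, u) / num_samples
    return weighted_outer - second_diff.mean() * np.eye(d)
```

For a quadratic f(x) = ½ xᵀAx with symmetric A, this estimator is unbiased for A, so a large-sample average should approach A. A negative smallest eigenvalue of the estimate (for example via `np.linalg.eigvalsh`) then signals a direction of negative curvature, which is the kind of information a saddle-avoiding method needs.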
Abstract
Bandit algorithms have been predominantly analyzed in the convex setting with function-value based stationary regret as the performance measure. In this paper, motivated by online reinforcement learning problems, we propose and analyze bandit algorithms for both general and structured nonconvex problems with nonstationary (or dynamic) regret as the performance measure, in both stochastic and non-stochastic settings. First, for general nonconvex functions, we consider nonstationary versions of first-order and second-order stationary solutions as a regret measure, motivated by similar performance measures for offline nonconvex optimization. In the case of second-order stationary solution based regret, we propose and analyze online and bandit versions of the cubic regularized Newton's method. The bandit version is based on estimating the Hessian matrices in the bandit setting, based on…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Advanced Wireless Network Optimization
