Derivative-Free & Order-Robust Optimisation

Victor Gabillon; Rasul Tutunov; Michal Valko; Haitham Bou Ammar

arXiv:1910.04034·cs.LG·October 23, 2019

Open Access

TL;DR

This paper introduces Vroom, a zeroth-order optimization algorithm designed for non-stationary and adversarial environments. It achieves vanishing regret there and targets simple regret under adversarial rewards, a setting rarely explored in online learning.

Contribution

The paper formalizes order-robust optimization as an online learning problem that minimizes simple regret, and presents Vroom, whose results the authors describe as the first to target simple regret in adversarial settings.

Findings

01. Vroom achieves vanishing regret in non-stationary environments.

02. It recovers favorable rates under stochastic reward processes.

03. It addresses a challenge rarely considered in prior work: simple regret in adversarial scenarios.

Abstract

In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments, while recovering favorable rates under stochastic reward-generating processes. Our results are the first to target simple regret definitions in adversarial scenarios, unveiling a challenge that has been rarely considered in prior work.
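To make the notion of simple regret concrete, here is a minimal Python sketch. It assumes one common formalisation for non-stationary rewards: the learner recommends a single point after all rounds, and that point is compared with the best fixed point on the time-averaged reward. The paper's exact definition may differ. The function name `simple_regret`, the drifting quadratic rewards, and the candidate grid are illustrative assumptions, not taken from the paper, and the code does not implement Vroom.

```python
import numpy as np

def simple_regret(reward_fns, candidates, recommended):
    """Gap between the best fixed candidate and the recommended point,
    both scored on the time-averaged reward (an assumed formalisation)."""
    def avg_reward(x):
        # Average the per-round rewards f_1(x), ..., f_n(x).
        return float(np.mean([f(x) for f in reward_fns]))

    best = max(avg_reward(x) for x in candidates)
    return best - avg_reward(recommended)

# Hypothetical non-stationary rewards: the peak drifts from 0.2 to 0.8.
peaks = np.linspace(0.2, 0.8, 50)
reward_fns = [lambda x, c=c: 1.0 - (x - c) ** 2 for c in peaks]

# Finite grid standing in for the search space.
candidates = np.linspace(0.0, 1.0, 101)

regret_mid = simple_regret(reward_fns, candidates, 0.5)   # centre of the drift
regret_edge = simple_regret(reward_fns, candidates, 0.0)  # far from every peak

print(f"regret at 0.5: {regret_mid:.4f}")
print(f"regret at 0.0: {regret_edge:.4f}")
```

In this toy setup the time-averaged reward peaks at the centre of the drift, so recommending 0.5 gives essentially zero regret, while recommending 0.0 does not. An algorithm with vanishing simple regret would drive this gap towards zero as the number of rounds grows.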

Taxonomy

Topics: Advanced Control Systems Optimization