Online Optimization: Competing with Dynamic Comparators

Ali Jadbabaie; Alexander Rakhlin; Shahin Shahrampour; Karthik Sridharan

arXiv:1501.06225 · cs.LG · January 27, 2015 · 93 cites

TL;DR

This paper introduces a fully adaptive online learning algorithm that competes with dynamic benchmarks. Its regret bound scales with the regularity of the sequence of cost functions and comparators, and the results apply to drifting zero-sum, two-player games.

Contribution

The paper develops a fully adaptive method for online learning that competes with dynamic benchmarks and adapts to the environment's regularity, extending to drifting game scenarios.

Findings

01

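Regret bounds scale with the regularity of the sequence of cost functions and comparators, adapting to the smaller complexity measure in the problem environment.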

02

The method competes with dynamic benchmarks, that is, comparator sequences that change over time, while retaining worst-case guarantees.

03

In drifting zero-sum, two-player games, both players achieve no-regret guarantees against the best sequences of actions in hindsight.

Abstract

Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees. A complementary direction is to develop prediction methods that perform well against complex benchmarks. In this paper, we address these two directions together. We present a fully adaptive method that competes with dynamic benchmarks in which regret guarantee scales with regularity of the sequence of cost functions and comparators. Notably, the regret bound adapts to the smaller complexity measure in the problem environment. Finally, we apply our results to drifting zero-sum, two-player games where both players achieve no regret guarantees against best sequences of actions in hindsight.
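To make "competing with dynamic benchmarks" concrete, the sketch below computes the regret of a learner against a comparator sequence that changes every round, together with the path length of that sequence as one simple measure of comparator regularity. It is an illustration only: the learner is plain projected online gradient descent, not the adaptive method of the paper, and the quadratic losses, the drift schedule, the step size, and all function names (`dynamic_regret`, `path_length`, `projected_ogd`) are assumptions made for this example.

```python
def dynamic_regret(losses, plays, comparators):
    """Sum over rounds of f_t(x_t) - f_t(u_t) for a changing comparator u_t."""
    return sum(f(x) - f(u) for f, x, u in zip(losses, plays, comparators))


def path_length(comparators):
    """Total movement of the comparator sequence: sum of |u_t - u_{t-1}|."""
    return sum(abs(b - a) for a, b in zip(comparators, comparators[1:]))


def projected_ogd(grads, x0, step, lo, hi):
    """Play x_t, observe the gradient of f_t at x_t, step, and clip to [lo, hi]."""
    x = x0
    plays = []
    for g in grads:
        plays.append(x)
        x = min(hi, max(lo, x - step * g(x)))
    return plays


# Hypothetical environment: f_t(x) = (x - c_t)^2 with a slowly drifting minimizer c_t.
T = 200
centers = [0.5 * t / (T - 1) for t in range(T)]
losses = [lambda x, c=c: (x - c) ** 2 for c in centers]
grads = [lambda x, c=c: 2.0 * (x - c) for c in centers]

plays = projected_ogd(grads, x0=0.0, step=0.25, lo=-1.0, hi=1.0)

# The comparator at round t is the per-round minimizer c_t.
regret = dynamic_regret(losses, plays, centers)
c_T = path_length(centers)
print(f"dynamic regret = {regret:.6f}, comparator path length = {c_T:.3f}")
```

Because the comparator here is the minimizer of each round's loss, the dynamic regret is nonnegative by construction; a slower drift (shorter path length) leaves the learner less to track, which is the kind of regularity the abstract refers to.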

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Optimization and Search Problems