Defensive Universal Learning with Experts

Jan Poland; Marcus Hutter

arXiv:cs/0507044 · cs.LG · May 23, 2007

Open Access

TL;DR

This paper develops a universal learning algorithm based on expert advice with bandit feedback. The algorithm handles countably infinite expert classes and adaptive adversaries, and it performs, in a certain sense, almost as well as any computable strategy.

Contribution

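It introduces an experts algorithm for bandit feedback that works with countably infinite expert classes, tolerates slowly growing losses, and comes with loss bounds against an adaptive adversary. Combined with a universal expert class, it yields a universal learner.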

Findings

01 Proves loss bounds against an adaptive adversary.

02 Handles countably infinite expert classes.

03 Performs, in a certain sense, almost as well as any computable strategy.

Abstract

This paper shows how universal learning can be achieved with expert advice. To this aim, we specify an experts algorithm with the following characteristics: (a) it uses only feedback from the actions actually chosen (bandit setup), (b) it can be applied with countably infinite expert classes, and (c) it copes with losses that may grow in time appropriately slowly. We prove loss bounds against an adaptive adversary. From this, we obtain a master algorithm for "reactive" experts problems, which means that the master's actions may influence the behavior of the adversary. Our algorithm can significantly outperform standard experts algorithms on such problems. Finally, we combine it with a universal expert class. The resulting universal learner performs -- in a certain sense -- almost as well as any computable strategy, for any online decision problem. We also specify the (worst-case)…
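To make the bandit setup in (a) concrete, here is a minimal sketch of an experts master that sees only the loss of the action it actually plays. It is not the algorithm from the paper: it is a standard Exp4-style exponential-weights scheme, swapped in for illustration, and it assumes a finite list of deterministic experts and losses in [0, 1], so it covers neither countably infinite classes nor growing losses. The function name `exp4_sketch` and the parameters `eta` and `gamma` are hypothetical.

```python
import math
import random


def exp4_sketch(experts, loss_fn, rounds, eta=0.1, gamma=0.1, seed=0):
    """Exp4-style master over a FINITE expert list with bandit feedback.

    Illustrative assumptions (not taken from the paper):
      experts: list of deterministic callables t -> action
      loss_fn: callable (t, action) -> loss in [0, 1]; it is evaluated
               only for the action that is actually played
    Returns the master's cumulative loss and its last sampling distribution.
    """
    rng = random.Random(seed)
    n = len(experts)
    est = [0.0] * n  # importance-weighted cumulative loss estimates
    total = 0.0
    p = [1.0 / n] * n
    for t in range(rounds):
        # Exponential weights, shifted by the minimum for numerical stability.
        low = min(est)
        w = [math.exp(-eta * (e - low)) for e in est]
        s = sum(w)
        # Mix in uniform exploration so every expert keeps probability >= gamma / n.
        p = [(1 - gamma) * wi / s + gamma / n for wi in w]
        i = rng.choices(range(n), weights=p)[0]
        action = experts[i](t)
        loss = loss_fn(t, action)  # the only feedback the master receives
        total += loss
        # Experts that recommended the played action share its observed loss.
        agree = [j for j in range(n) if experts[j](t) == action]
        q = sum(p[j] for j in agree)  # probability that this action was played
        for j in agree:
            est[j] += loss / q  # unbiased estimate of expert j's loss
    return total, p
```

Dividing the observed loss by the probability `q` of playing that action keeps each expert's estimate unbiased even though most losses are never observed. The exploration term `gamma / n` keeps `q` bounded away from zero.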

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Mobile Crowdsensing and Crowdsourcing