Learning to Use Learners' Advice

Adish Singla; Hamed Hassani; Andreas Krause

arXiv:1702.04825 · cs.LG · February 21, 2017 · 2 cites

TL;DR

This paper introduces an online learning framework in which the experts are themselves learners and only the expert selected in each round receives feedback. It shows that no-regret guarantees are impossible without coordination and proposes an algorithm that achieves sublinear regret by guiding the feedback that experts receive.

Contribution

The paper models experts as learning entities with limited feedback and develops a novel algorithm that guides feedback to achieve no-regret guarantees.

Findings

01

Proves the impossibility of no-regret algorithms without coordination.

02

Designs a feedback-guided algorithm achieving regret of $O(T^{1/(2 - \regretRate)})$.

03

Demonstrates the effectiveness of guiding expert feedback in online learning.
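
To make the regret bound in finding 02 concrete, the short sketch below evaluates the exponent 1/(2 - regretRate) at a few values of the parameter. The function name `regret_exponent` is hypothetical and introduced only for illustration; the formula and the range [0, 1] of the parameter are taken from this page.

```python
def regret_exponent(regret_rate: float) -> float:
    """Exponent of T in the bound O(T^{1/(2 - regretRate)}).

    `regret_rate` is the parameter the abstract restricts to [0, 1].
    """
    if not 0.0 <= regret_rate <= 1.0:
        raise ValueError("regret_rate must lie in [0, 1]")
    return 1.0 / (2.0 - regret_rate)


if __name__ == "__main__":
    # Print the exponent at the two endpoints and at the midpoint.
    for rate in (0.0, 0.5, 1.0):
        print(f"regretRate={rate}: exponent={regret_exponent(rate):.3f}")
```

At the lower endpoint the exponent is 1/2, and it grows to 1 at the upper endpoint, where the bound becomes linear in T.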

Abstract

In this paper, we study a variant of the framework of online learning using expert advice with limited/bandit feedback. We consider each expert as a learning entity, thereby more accurately reflecting certain real-world applications. In our setting, the feedback at any time $t$ is limited in the sense that it is only available to the expert $i^{t}$ that has been selected by the central algorithm (forecaster), \emph{i.e.}, only the expert $i^{t}$ receives feedback from the environment and gets to learn at time $t$. We consider a generic black-box approach whereby the forecaster does not control or know the learning dynamics of the experts apart from knowing the following no-regret learning property: the average regret of any expert $j$ vanishes at a rate of at least $O (t_{j}^{\regretRate - 1})$ with $t_{j}$ learning steps where $\regretRate \in [0, 1]$ is a parameter. In the spirit of competing…
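
The interaction protocol described in the abstract can be sketched as a small simulation. This is a toy illustration under stated assumptions, not the paper's algorithm: `ToyExpert`, `run_protocol`, the uniform-random selection rule, and the random-loss environment are all hypothetical stand-ins, and the toy learner is not claimed to satisfy the no-regret property. The only part taken from the abstract is the feedback structure: at each time step the forecaster selects one expert, and only that expert observes feedback and takes a learning step.

```python
import random


class ToyExpert:
    """Hypothetical expert that learns from the feedback it receives."""

    def __init__(self, n_actions):
        self.loss_sums = [0.0] * n_actions
        self.counts = [0] * n_actions
        self.learning_steps = 0  # plays the role of t_j in the abstract

    def recommend(self):
        # Try every action once, then pick the lowest mean observed loss.
        for action, count in enumerate(self.counts):
            if count == 0:
                return action
        means = [s / c for s, c in zip(self.loss_sums, self.counts)]
        return min(range(len(means)), key=means.__getitem__)

    def update(self, action, loss):
        self.loss_sums[action] += loss
        self.counts[action] += 1
        self.learning_steps += 1


def run_protocol(n_experts=3, n_actions=2, horizon=50, seed=0):
    rng = random.Random(seed)
    experts = [ToyExpert(n_actions) for _ in range(n_experts)]
    total_loss = 0.0
    for _ in range(horizon):
        # Placeholder selection rule (assumption): uniform at random.
        i_t = rng.randrange(n_experts)
        action = experts[i_t].recommend()
        loss = rng.random()  # stand-in environment with losses in [0, 1)
        total_loss += loss
        # Limited feedback: only the selected expert i_t gets to learn.
        experts[i_t].update(action, loss)
    return experts, total_loss


if __name__ == "__main__":
    experts, total_loss = run_protocol()
    print([e.learning_steps for e in experts], round(total_loss, 2))
```

Because each round advances exactly one expert, the per-expert learning steps always sum to the horizon. The forecaster's selection rule therefore determines how the learning steps are distributed across experts.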

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics