Online learning with feedback graphs and switching costs

Anshuka Rangi; Massimo Franceschetti

arXiv:1810.09666·cs.LG·May 21, 2019·5 cites

TL;DR

This paper investigates online learning with feedback graphs and switching costs, establishing regret lower bounds, highlighting the limitations of existing algorithms, and proposing new algorithms that are order optimal in various settings.

Contribution

It provides the first lower bounds for general feedback graphs with switching costs and introduces two new algorithms, Threshold Based EXP3 and EXP3.SC, that are order optimal in key scenarios.

Findings

01

Threshold Based EXP3 outperforms previously proposed algorithms in the presence of switching costs.

02

Both proposed algorithms are order optimal in symmetric PI and MAB settings.

03

Threshold Based EXP3 is order optimal with respect to the switching cost.

Abstract

We study online learning when partial feedback information is provided following every action of the learning process, and the learner incurs switching costs for changing his actions. In this setting, the feedback information system can be represented by a graph, and previous works studied the expected regret of the learner in the case of a clique (Expert setup), or disconnected single loops (Multi-Armed Bandits (MAB)). This work provides a lower bound on the expected regret in the Partial Information (PI) setting, namely for general feedback graphs, excluding the clique. Additionally, it shows that all algorithms that are optimal without switching costs are necessarily sub-optimal in the presence of switching costs, which motivates the need to design new algorithms. We propose two new algorithms: Threshold Based EXP3 and EXP3.SC. For the two special cases of symmetric PI setting and…
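The setting in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it only models a feedback graph as an adjacency map, shows the two extreme cases the abstract names (clique and self-loops only), and computes regret with switching costs against the best fixed action. The function names, the unit switching cost, and the best-fixed-action comparator are assumptions made for illustration.

```python
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]  # action -> set of actions whose loss it reveals


def clique_graph(n: int) -> Graph:
    # Expert setup: playing any action reveals the loss of every action.
    return {i: set(range(n)) for i in range(n)}


def self_loop_graph(n: int) -> Graph:
    # MAB: playing an action reveals only its own loss.
    return {i: {i} for i in range(n)}


def observed_losses(graph: Graph, action: int, losses: List[float]) -> Dict[int, float]:
    # Losses the learner sees after playing `action` in one round.
    return {j: losses[j] for j in graph[action]}


def total_cost(actions: List[int], loss_rounds: List[List[float]],
               switch_cost: float = 1.0) -> float:
    # Sum of incurred losses plus `switch_cost` (assumed constant) per change of action.
    total = 0.0
    prev = None
    for action, losses in zip(actions, loss_rounds):
        total += losses[action]
        if prev is not None and action != prev:
            total += switch_cost
        prev = action
    return total


def regret(actions: List[int], loss_rounds: List[List[float]],
           switch_cost: float = 1.0) -> float:
    # Regret against the best fixed action, which never switches (assumed comparator).
    n = len(loss_rounds[0])
    best_fixed = min(sum(r[i] for r in loss_rounds) for i in range(n))
    return total_cost(actions, loss_rounds, switch_cost) - best_fixed
```

In this sketch a learner that chases the lowest loss every round can still accumulate regret purely from switching, which is the tension the abstract describes between algorithms tuned for the no-switching-cost case and the setting with switching costs.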

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems