Online learning with feedback graphs and switching costs
Anshuka Rangi, Massimo Franceschetti

TL;DR
This paper investigates online learning with feedback graphs and switching costs, establishing regret lower bounds, highlighting the limitations of existing algorithms, and proposing new algorithms that are order optimal in various settings.
Contribution
The paper provides the first lower bounds for general feedback graphs with switching costs and introduces two new algorithms, Threshold Based EXP3 and EXP3.SC, that are order optimal in key scenarios.
Findings
Threshold Based EXP3 outperforms previous algorithms when switching costs are present.
Both proposed algorithms are order optimal in the symmetric PI and MAB settings.
Threshold Based EXP3 is order optimal in the switching cost.
Abstract
We study online learning when partial feedback information is provided following every action of the learning process, and the learner incurs switching costs for changing his actions. In this setting, the feedback information system can be represented by a graph, and previous works studied the expected regret of the learner in the case of a clique (Expert setup), or disconnected single loops (Multi-Armed Bandits (MAB)). This work provides a lower bound on the expected regret in the Partial Information (PI) setting, namely for general feedback graphs, excluding the clique. Additionally, it shows that all algorithms that are optimal without switching costs are necessarily sub-optimal in the presence of switching costs, which motivates the need to design new algorithms. We propose two new algorithms: Threshold Based EXP3 and EXP3.SC. For the two special cases of symmetric PI setting and…
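To make the setting in the abstract concrete, the following is a minimal Python sketch of a feedback graph and of the cost a learner pays when switching costs are present. It is not the paper's algorithm, and none of the names come from the paper: `clique_graph`, `self_loop_graph`, `play`, and the parameter `switch_cost` are hypothetical. It also assumes that the first round incurs no switching cost and that losses are given as a per-round list.

```python
from typing import Dict, List, Set, Tuple


def clique_graph(k: int) -> Dict[int, Set[int]]:
    """Expert setup: playing any action reveals the loss of every action."""
    return {i: set(range(k)) for i in range(k)}


def self_loop_graph(k: int) -> Dict[int, Set[int]]:
    """MAB setup: playing an action reveals only that action's own loss."""
    return {i: {i} for i in range(k)}


def play(
    actions: List[int],
    losses: List[List[float]],
    graph: Dict[int, Set[int]],
    switch_cost: float,
) -> Tuple[float, List[Dict[int, float]]]:
    """Return the total cost and the per-round observations.

    Total cost = sum of losses of the played actions, plus `switch_cost`
    for every round in which the action differs from the previous one.
    Assumption: the first round carries no switching cost.
    """
    total = 0.0
    observed: List[Dict[int, float]] = []
    prev = None
    for t, a in enumerate(actions):
        total += losses[t][a]
        if prev is not None and a != prev:
            total += switch_cost
        # The feedback graph decides which losses the learner gets to see.
        observed.append({j: losses[t][j] for j in graph[a]})
        prev = a
    return total, observed
```

Under these assumptions, the same action sequence pays the same cost on either graph; only the amount of information revealed per round changes, which is the quantity the feedback graph controls.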
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
