Efficient Swap Regret Minimization in Combinatorial Bandits
Andreas Kontogiannis, Vasilis Pollatos, Panayotis Mertikopoulos, Ioannis Panageas

TL;DR
This paper introduces a new no-swap regret algorithm for combinatorial bandits that achieves polylogarithmic dependence on the number of actions, making it computationally efficient for large-scale problems.
Contribution
The paper presents the first efficient no-swap regret algorithm with polylogarithmic dependence on the action space size in combinatorial bandits, and demonstrates its practical implementation.
Findings
Achieves swap regret that is sublinear in the horizon and depends only polylogarithmically on the number of actions in combinatorial bandits.
Provides an efficient implementation with polylogarithmic per-iteration complexity.
Proves tightness of the regret bounds for the proposed algorithm.
Abstract
This paper addresses the problem of designing efficient no-swap regret algorithms for combinatorial bandits, where the number of actions is exponentially large in the dimensionality of the problem. In this setting, designing efficient no-swap regret algorithms translates to achieving swap regret that is sublinear in the horizon, with polylogarithmic dependence on the number of actions. In contrast to the weaker notion of external regret minimization, a problem which is fairly well understood in the literature, achieving no-swap regret with a polylogarithmic dependence on the number of actions has remained elusive in combinatorial bandits. Our paper resolves this challenge by introducing a no-swap-regret learning algorithm with regret that scales polylogarithmically in the number of actions and is tight for the class of combinatorial bandits. To ground our results, we also demonstrate how to implement the proposed algorithm efficiently -- that is, with a…
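To make the contrast between external and swap regret concrete, here is a minimal Python sketch that computes both in hindsight for a small, fully observed loss table. It is an illustration of the two regret notions only, not the paper's algorithm: the function names and the toy data are assumptions, and the brute-force loop over every action is exactly what becomes infeasible when the action set is exponentially large.

```python
from typing import Sequence


def external_regret(actions: Sequence[int], losses: Sequence[Sequence[float]]) -> float:
    """Incurred loss minus the loss of the best single fixed action in hindsight."""
    n = len(losses[0])
    incurred = sum(losses[t][a] for t, a in enumerate(actions))
    best_fixed = min(sum(row[b] for row in losses) for b in range(n))
    return incurred - best_fixed


def swap_regret(actions: Sequence[int], losses: Sequence[Sequence[float]]) -> float:
    """Incurred loss minus the loss under the best swap map (action -> action).

    The best swap map can be chosen independently for each played action,
    so the total is a sum of per-action gains.
    """
    n = len(losses[0])
    total = 0.0
    for a in range(n):
        # Rounds in which action `a` was actually played.
        rounds = [t for t, played in enumerate(actions) if played == a]
        if not rounds:
            continue
        incurred = sum(losses[t][a] for t in rounds)
        best_swap = min(sum(losses[t][b] for t in rounds) for b in range(n))
        total += incurred - best_swap
    return total


# Toy data (hypothetical): two actions, four rounds, losses[t][a] in {0, 1}.
toy_actions = [0, 0, 1, 1]
toy_losses = [[1, 0], [1, 0], [0, 1], [0, 1]]
print(external_regret(toy_actions, toy_losses))
print(swap_regret(toy_actions, toy_losses))
```

On this toy sequence the learner always picks the worse action, so swapping each action for the other recovers more loss than any single fixed action could. Swap regret is never smaller than external regret, which is why it is the stronger notion.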
Taxonomy
Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Stochastic Gradient Optimization Techniques
