Sleeping Combinatorial Bandits

Kumar Abhishek; Ganesh Ghalme; Sujit Gujar; Yadati Narahari

arXiv:2106.01624·cs.LG·June 4, 2021

TL;DR

This paper introduces a new algorithm, CS-UCB, for sleeping combinatorial bandits, achieving logarithmic and sublinear regret bounds, and validates its effectiveness through theoretical analysis and experiments.

Contribution

It adapts the CUCB algorithm to sleeping combinatorial bandits and provides regret guarantees under general conditions, extending prior work.

Findings

01

Achieves $O(\log T)$ regret in certain settings.

02

Attains $O(\sqrt[3]{T^{2} \log T})$ regret in the general case.

03

Validates theoretical results with experiments.
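
To give a feel for how the two stated rates compare, the following minimal sketch evaluates $\log T$ and $\sqrt[3]{T^{2} \log T}$ against linear growth $T$. Constants hidden by the $O(\cdot)$ notation are ignored, and the function name `regret_rates` is hypothetical, introduced only for illustration.

```python
import math


def regret_rates(T):
    """Return the growth terms of the stated bounds, ignoring constants."""
    return {
        "log T": math.log(T),
        "(T^2 log T)^(1/3)": (T * T * math.log(T)) ** (1.0 / 3.0),
        "T": float(T),  # linear growth, i.e. no learning at all
    }


for T in (10**3, 10**6):
    rates = regret_rates(T)
    print(T, {name: round(value, 1) for name, value in rates.items()})
```

Both terms grow more slowly than $T$, so the average regret per round shrinks as the horizon grows; the cube-root term shrinks far more slowly than the logarithmic one.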

Abstract

In this paper, we study an interesting combination of sleeping and combinatorial stochastic bandits. In the mixed model studied here, at each discrete time instant, an arbitrary availability set is generated from a fixed set of base arms. An algorithm can select a subset of arms from the availability set (sleeping bandits) and receive the corresponding reward along with semi-bandit feedback (combinatorial bandits). We adapt the well-known CUCB algorithm in the sleeping combinatorial bandits setting and refer to it as CS-UCB. We prove -- under mild smoothness conditions -- that the CS-UCB algorithm achieves an $O(\log T)$ instance-dependent regret guarantee. We further prove that (i) when the range of the rewards is bounded, the regret guarantee of the CS-UCB algorithm is $O(\sqrt{T \log T})$ and (ii) the instance-independent regret is $O(\sqrt[3]{T^{2} \log T})$ …
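
The interaction loop described in the abstract can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not the authors' algorithm or analysis: rewards are Bernoulli, the reward of a chosen set is the sum of its arms' rewards, the oracle simply picks the `k` available arms with the highest optimistic indices, and each arm is available independently with probability 0.7. The function name `cs_ucb_sketch`, the exploration constant 1.5, and all parameters are hypothetical.

```python
import math
import random


def cs_ucb_sketch(means, k, horizon, seed=0):
    """UCB-style learner on a toy sleeping combinatorial bandit.

    Assumptions (not from the paper): Bernoulli arms, additive rewards,
    a top-k oracle, and independent availability with probability 0.7.
    """
    rng = random.Random(seed)
    n = len(means)
    pulls = [0] * n        # times each base arm has been observed
    estimates = [0.0] * n  # empirical mean reward of each base arm
    regret = 0.0           # cumulative expected (pseudo-)regret

    for t in range(1, horizon + 1):
        # Sleeping bandits: only some base arms are available this round.
        available = [i for i in range(n) if rng.random() < 0.7]
        if not available:
            continue

        def index(i):
            # Optimistic index; arms never observed are tried first.
            if pulls[i] == 0:
                return math.inf
            return estimates[i] + math.sqrt(1.5 * math.log(t) / pulls[i])

        # Oracle: best available subset of size at most k under the indices.
        chosen = sorted(available, key=index, reverse=True)[:k]
        # Benchmark: best available subset under the true means.
        best = sorted(available, key=lambda i: means[i], reverse=True)[:k]

        # Semi-bandit feedback: observe a reward for every chosen arm.
        for i in chosen:
            reward = 1.0 if rng.random() < means[i] else 0.0
            pulls[i] += 1
            estimates[i] += (reward - estimates[i]) / pulls[i]

        regret += sum(means[i] for i in best) - sum(means[i] for i in chosen)

    return regret, pulls


regret, pulls = cs_ucb_sketch([0.9, 0.8, 0.5, 0.3, 0.1], k=2, horizon=2000)
print(f"cumulative pseudo-regret: {regret:.1f}, pulls per arm: {pulls}")
```

Regret here is measured against the best subset of the arms available in that round, which is what distinguishes the sleeping setting from a fixed-arm benchmark.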

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Smart Grid Energy Management