Combinatorial Stochastic-Greedy Bandit

Fares Fourati; Christopher John Quinn; Mohamed-Slim Alouini; Vaneet Aggarwal

arXiv:2312.08057·cs.LG·December 14, 2023·1 cites

Open Access

TL;DR

This paper introduces a new combinatorial stochastic-greedy bandit algorithm that efficiently balances exploration and exploitation, achieving improved regret bounds and superior empirical performance in social influence maximization tasks.

Contribution

The paper presents a novel SGB algorithm with optimized sampling and proven regret bounds, outperforming existing methods for large-scale combinatorial bandit problems.

Findings

01

Achieves a $(1-1/e)$-regret bound of $\tilde{O}(n^{1/3} k^{2/3} T^{2/3})$ for monotone submodular rewards.

02

Outperforms state-of-the-art algorithms in online social influence maximization.

03

The performance gap over the compared algorithms widens as the cardinality constraint $k$ grows.
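For readers unfamiliar with the term, the $(1-1/e)$-regret in the first finding can be written out as follows. This is the standard approximation-regret definition, given here as a sketch; the symbols $f$ (expected joint reward of a set), $S^*$ (an optimal set of at most $k$ arms), and $S_t$ (the set played at step $t$) are notation assumed for illustration and may differ from the paper's.

```latex
\mathbb{E}[\mathcal{R}(T)]
  = \left(1 - \frac{1}{e}\right) T \, f(S^*)
  - \mathbb{E}\left[\sum_{t=1}^{T} f(S_t)\right],
\qquad
S^* \in \arg\max_{|S| \le k} f(S).
```

The factor $1 - 1/e$ is the approximation guarantee of the greedy algorithm for maximizing a monotone submodular function under a cardinality constraint, so the learner is compared against what an offline greedy-style solution can guarantee rather than against the exact optimum.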

Abstract

We propose a novel combinatorial stochastic-greedy bandit (SGB) algorithm for combinatorial multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time step $t \in [T]$ is observed. SGB adopts an optimized stochastic-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms. Unlike existing methods that explore the entire set of unselected base arms during each selection step, our SGB algorithm samples only an optimized proportion of unselected arms and selects actions from this subset. We prove that our algorithm achieves a $(1 - 1/e)$-regret bound of $O(n^{\frac{1}{3}} k^{\frac{2}{3}} T^{\frac{2}{3}} \log(T)^{\frac{2}{3}})$ for monotone stochastic submodular rewards, which outperforms the state-of-the-art in terms of the cardinality constraint $k$. Furthermore, we…
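The stochastic-explore-then-commit idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `stochastic_greedy_etc`, the callback `play`, and the parameters `sample_size` and `plays_per_set` are hypothetical, and the paper's optimized choices for the sampling proportion and the number of plays are not reproduced here. The sketch assumes only that playing a set returns one noisy joint reward.

```python
import math
import random


def stochastic_greedy_etc(n, k, horizon, play, sample_size, plays_per_set, rng=None):
    """Sampled greedy exploration followed by commitment (illustrative only).

    n: number of base arms, k: cardinality constraint, horizon: total plays T.
    play(S): hypothetical callback returning one joint reward for the list S.
    sample_size, plays_per_set: assumed tuning knobs, not the paper's values.
    """
    rng = rng or random.Random(0)
    selected, t, total = [], 0, 0.0

    for _ in range(k):
        # Sample only a subset of the unselected arms instead of all of them.
        unselected = [a for a in range(n) if a not in selected]
        candidates = rng.sample(unselected, min(sample_size, len(unselected)))

        best_arm, best_mean = None, -math.inf
        for arm in candidates:
            trial_set = selected + [arm]
            rewards = []
            for _ in range(plays_per_set):
                if t >= horizon:
                    break
                reward = play(trial_set)
                rewards.append(reward)
                total += reward
                t += 1
            # Keep the candidate with the highest empirical mean joint reward.
            if rewards and sum(rewards) / len(rewards) > best_mean:
                best_arm, best_mean = arm, sum(rewards) / len(rewards)

        if best_arm is None:  # horizon exhausted before any candidate was played
            break
        selected.append(best_arm)

    # Commit: play the chosen set for all remaining time steps.
    while t < horizon and selected:
        total += play(selected)
        t += 1
    return selected, total


if __name__ == "__main__":
    # Toy noisy modular reward, purely for demonstration.
    weights = [0.1, 0.5, 0.2, 0.9, 0.3, 0.4]
    noise = random.Random(1)
    noisy = lambda s: sum(weights[a] for a in s) + noise.gauss(0, 0.05)
    print(stochastic_greedy_etc(6, 2, 200, noisy, sample_size=3, plays_per_set=5))
```

Each greedy step here costs at most `sample_size * plays_per_set` plays rather than one batch per unselected arm, which is the saving the abstract attributes to sampling a proportion of the unselected arms.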

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems

Methods: Sparse Evolutionary Training · Balanced Selection