Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications

Qinshi Wang; Wei Chen

arXiv:1703.01610 · cs.LG · June 9, 2021 · 35 cites

Open Access

TL;DR

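This paper improves regret bounds for combinatorial semi-bandits with probabilistically triggered arms (CMAB-T) by removing a possibly exponentially large $1/p^*$ factor, where $p^*$ is the minimum positive probability that an arm is triggered by any action. It does so through a new triggering probability modulated (TPM) bounded smoothness condition that several applications satisfy.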

Contribution

The authors introduce the TPM bounded smoothness condition, which eliminates the $1/p^*$ factor from regret bounds for CMAB-T problems, and show through lower bounds that this factor cannot be avoided without the condition.

Findings

01

Regret bounds are significantly improved for influence maximization and cascading bandits.

02

The TPM condition is satisfied by many practical applications.

03

Lower bounds show the $1/p^*$ factor is unavoidable without TPM.

Abstract

We study combinatorial multi-armed bandit with probabilistically triggered arms (CMAB-T) and semi-bandit feedback. We resolve a serious issue in the prior CMAB-T studies where the regret bounds contain a possibly exponentially large factor of $1/p^{*}$, where $p^{*}$ is the minimum positive probability that an arm is triggered by any action. We address this issue by introducing a triggering probability modulated (TPM) bounded smoothness condition into the general CMAB-T framework, and show that many applications such as influence maximization bandit and combinatorial cascading bandit satisfy this TPM condition. As a result, we completely remove the factor of $1/p^{*}$ from the regret bounds, achieving significantly better regret bounds for influence maximization and cascading bandits than before. Finally, we provide lower bound results showing that the factor $1/p^{*}$ is unavoidable for…
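To make the "possibly exponentially large" factor concrete, here is a minimal sketch that is not taken from the paper. It assumes a toy influence maximization instance: a directed path graph under the independent cascade model, a single seed node, and the same activation probability on every edge. In this toy model an edge counts as triggered (observed) exactly when its source node is activated. The function name `min_trigger_probability` and its parameters are hypothetical, chosen only for illustration.

```python
def min_trigger_probability(num_nodes: int, edge_prob: float) -> float:
    """Smallest positive triggering probability p* on a directed path.

    Toy assumptions (not from the paper): nodes 1 -> 2 -> ... -> num_nodes,
    every edge succeeds independently with probability edge_prob, and an
    action is the choice of a single seed node.
    """
    p_star = 1.0
    for seed in range(1, num_nodes):          # seeding the last node triggers no edge
        for src in range(seed, num_nodes):    # edge (src, src + 1), downstream of the seed
            # The edge is observed iff src is activated, which requires the
            # (src - seed) edges between the seed and src to all succeed.
            trigger = edge_prob ** (src - seed)
            if 0.0 < trigger < p_star:
                p_star = trigger
    return p_star


if __name__ == "__main__":
    for n in (4, 8, 16):
        p_star = min_trigger_probability(n, 0.5)
        print(f"nodes={n:2d}  p*={p_star:.6g}  1/p*={1 / p_star:.6g}")
```

Under these assumptions $p^{*}$ equals `edge_prob ** (num_nodes - 2)`, so with an edge probability of 0.5 the factor $1/p^{*}$ is 4, 64, and 16384 for paths of 4, 8, and 16 nodes: it grows exponentially with the path length.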

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems