Combinatorial Logistic Bandits

Xutong Liu; Xiangxiang Dai; Xuchuang Wang; Mohammad Hajiesmaili; John C.S. Lui

arXiv:2410.17075·cs.LG·May 15, 2025

TL;DR

This paper introduces combinatorial logistic bandits, proposing algorithms with improved regret bounds for complex decision-making scenarios involving binary outcomes and logistic models, validated through experiments.

Contribution

The paper develops the CLogUCB and VA-CLogUCB algorithms with enhanced regret bounds and computational efficiency for combinatorial logistic bandits, advancing prior work.

Findings

01

CLogUCB achieves $\tilde{O}(d \textstyle\frac{\text{nonlinearity}}{\text{triggering probability}} \times K \times T)$ regret.

02

VA-CLogUCB attains $\tilde{O}(d \textstyle\frac{\text{nonlinearity}}{\text{triggering probability}} \times K \times T)$ regret, improved to $\tilde{O}(d \textstyle\frac{\text{nonlinearity}}{\text{triggering probability}} \times T)$ under stronger conditions.

03

Experiments show that the proposed algorithms outperform benchmark algorithms on both synthetic and real-world datasets.

Abstract

We introduce a novel framework called combinatorial logistic bandits (CLogB), where in each round, a subset of base arms (called the super arm) is selected, with the outcome of each base arm being binary and its expectation following a logistic parametric model. The feedback is governed by a general arm triggering process. Our study covers CLogB with reward functions satisfying two smoothness conditions, capturing application scenarios such as online content delivery, online learning to rank, and dynamic channel allocation. We first propose a simple yet efficient algorithm, CLogUCB, utilizing a variance-agnostic exploration bonus. Under the 1-norm triggering probability modulated (TPM) smoothness condition, CLogUCB achieves a regret bound of $\tilde{O}(d \kappa K T)$, where $\tilde{O}$ ignores logarithmic factors, $d$ is the dimension of the feature vector, $\kappa$ represents the…
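To make the setting in the abstract concrete, the following is a minimal sketch of one CLogB round in plain Python. It is not the authors' CLogUCB implementation. Several pieces are assumptions made for illustration only: the helper names (`optimistic_means`, `select_super_arm`, `play_round`), the constant exploration bonus, the top-k selection rule standing in for an offline oracle, and the simplification that every selected base arm is triggered and observed (the paper allows a general triggering process).

```python
import math
import random


def sigmoid(z):
    """Logistic link: maps a linear score to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def optimistic_means(features, theta_hat, bonus):
    """UCB-style estimate per base arm: sigmoid(x^T theta_hat) + bonus, clipped at 1.

    The bonus values are placeholders (assumption), not the paper's bonus.
    """
    return [min(1.0, sigmoid(dot(x, theta_hat)) + b)
            for x, b in zip(features, bonus)]


def select_super_arm(ucb, k):
    """Hypothetical oracle: take the k base arms with the largest optimistic mean."""
    return sorted(range(len(ucb)), key=lambda i: ucb[i], reverse=True)[:k]


def play_round(features, theta_star, super_arm, rng):
    """Draw a binary (Bernoulli) outcome for each selected base arm.

    Assumption: every selected arm is triggered, so all outcomes are observed.
    """
    return {i: int(rng.random() < sigmoid(dot(features[i], theta_star)))
            for i in super_arm}


# Toy instance: 4 base arms with 2-dimensional features (all values made up).
rng = random.Random(0)
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.2]]
theta_star = [1.0, -0.5]   # unknown parameter, used only to simulate feedback
theta_hat = [0.8, -0.3]    # current estimate held by the learner
bonus = [0.1, 0.1, 0.1, 0.1]

ucb = optimistic_means(features, theta_hat, bonus)
chosen = select_super_arm(ucb, k=2)
outcomes = play_round(features, theta_star, chosen, rng)
print("super arm:", chosen, "outcomes:", outcomes)
```

In a full algorithm the estimate `theta_hat` would be refit from the observed binary outcomes after each round (for example by regularized maximum likelihood) and the bonus would shrink as arms are observed; both steps are omitted here to keep the sketch short.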

Code & Models

Repositories

xiangxdai/combinatorial-logistic-bandit (JAX, official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Sentiment Analysis and Opinion Mining

Methods: Balanced Selection