UCB-based Algorithms for Multinomial Logistic Regression Bandits

Sanae Amani; Christos Thrampoulidis

arXiv:2103.11489 · cs.LG · March 23, 2021 · 1 citation

TL;DR

This paper extends logistic bandits from binary to multinomial outcomes and proposes MNL-UCB, a UCB-based algorithm that comes with a theoretical regret bound and is supported by numerical simulations in multi-outcome scenarios.

Contribution

It introduces MNL-UCB, a novel UCB-based algorithm for multinomial logistic bandits, providing the first regret guarantees for this setting.

Findings

01

MNL-UCB achieves a regret of $\tilde{\mathcal{O}}(dK\sqrt{T})$ in theory.

02

Numerical simulations confirm the effectiveness of MNL-UCB.

03

The approach handles multiple outcomes beyond binary rewards.

Abstract

Out of the rich family of generalized linear bandits, perhaps the most well-studied are logistic bandits, which are used in problems with binary rewards: for instance, when the learner/agent tries to maximize the profit over a user that can select one of two possible outcomes (e.g., 'click' vs 'no-click'). Despite remarkable recent progress and improved algorithms for logistic bandits, existing works do not address practical situations where the number of outcomes that can be selected by the user is larger than two (e.g., 'click', 'show me later', 'never show again', 'no click'). In this paper, we study such an extension. We use multinomial logit (MNL) to model the probability of each one of $K + 1 \geq 2$ possible outcomes (+1 stands for the 'not click' outcome): we assume that for a learner's action $x_{t}$, the user selects one of $K + 1 \geq 2$ outcomes, say outcome $i$, with a…
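
The abstract text is cut off before it states the outcome probabilities, so the following is only a minimal sketch of the standard multinomial logit form, not the paper's exact formulation. It assumes one hypothetical parameter vector per non-baseline outcome (the rows of `theta`), a baseline 'not click' outcome whose logit is fixed to zero, and hypothetical per-outcome revenues `rho`; all of these names are illustrative.

```python
import numpy as np


def mnl_probabilities(theta, x):
    """Probabilities of the K+1 outcomes under a standard multinomial logit.

    theta: (K, d) array, one assumed parameter vector per non-baseline outcome.
    x:     (d,) action (feature) vector chosen by the learner.
    Index 0 of the result is the baseline outcome; indices 1..K are the rest.
    """
    # Baseline outcome has logit 0; the others have logit theta_i . x
    logits = np.concatenate(([0.0], theta @ x))
    # Subtract the maximum for numerical stability (does not change the result)
    logits = logits - logits.max()
    weights = np.exp(logits)
    return weights / weights.sum()


def expected_revenue(theta, x, rho):
    """Expected revenue of action x, assuming the baseline outcome earns 0.

    rho: (K,) array of assumed revenues for outcomes 1..K.
    """
    probs = mnl_probabilities(theta, x)
    return float(probs[1:] @ rho)


# Usage: with all-zero parameters every outcome is equally likely.
theta = np.zeros((3, 2))        # K = 3 outcomes besides the baseline, d = 2
x = np.array([1.0, -2.0])
rho = np.array([1.0, 2.0, 3.0])
print(mnl_probabilities(theta, x))   # each of the 4 outcomes gets 0.25
print(expected_revenue(theta, x, rho))  # 0.25 * (1 + 2 + 3) = 1.5
```

Fixing the baseline logit to zero is the usual identifiability convention for MNL models; with $K = 1$ the sketch reduces to the binary logistic model the abstract starts from.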

Videos

UCB-based Algorithms for Multinomial Logistic Regression Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Smart Grid Energy Management