A Tractable Online Learning Algorithm for the Multinomial Logit Contextual Bandit

Priyank Agrawal; Theja Tulabandhula; Vashist Avadhanula

arXiv:2011.14033·cs.LG·April 16, 2024·1 cites

Open Access

TL;DR

This paper introduces a new online learning algorithm for the MNL-Contextual Bandit problem that achieves better regret bounds and is computationally tractable, improving decision-making in dynamic assortment optimization.

Contribution

It proposes an optimistic algorithm with a convex relaxation for the MNL-Contextual Bandit problem, reducing regret bounds and computational complexity.

Findings

01. Regret bounded by $O(\sqrt{dT} + \kappa)$, an improvement over previous bounds.

02. Convex relaxation enables tractable decision-making.

03. Algorithm performs well in dynamic assortment optimization scenarios.

Abstract

In this paper, we consider the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem, where a decision-maker offers a subset (assortment) of products to a consumer and observes the response in every round. Consumers purchase products to maximize their utility. We assume that a set of attributes describes the products, and the mean utility of a product is linear in the values of these attributes. We model consumer choice behavior using the widely used Multinomial Logit (MNL) model and consider the decision-maker's problem of dynamically learning the model parameters while optimizing cumulative revenue over the selling horizon $T$. Though this problem has attracted considerable attention in recent times, many existing methods often involve solving an intractable non-convex optimization problem. Their theoretical performance guarantees…
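The choice model described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only shows how MNL choice probabilities and expected revenue might be computed for one offered assortment when mean utility is linear in product attributes. The names `theta`, `assortment`, `revenues`, and the three helper functions are hypothetical, and the no-purchase option with utility fixed at 0 is an assumption commonly made in MNL models that the excerpt above does not state.

```python
import math
import random


def mnl_choice_probs(assortment, theta):
    """Return MNL choice probabilities for one offered assortment.

    assortment: list of attribute vectors, one per offered product.
    theta: parameter vector; mean utility is the dot product x . theta.
    Index 0 of the result is the no-purchase option (assumed utility 0);
    index i >= 1 corresponds to assortment[i - 1].
    """
    utilities = [sum(a * t for a, t in zip(x, theta)) for x in assortment]
    all_utilities = [0.0] + utilities  # assumed outside option at utility 0
    shift = max(all_utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - shift) for u in all_utilities]
    total = sum(weights)
    return [w / total for w in weights]


def expected_revenue(assortment, theta, revenues):
    """Expected revenue of an assortment; no-purchase earns nothing."""
    probs = mnl_choice_probs(assortment, theta)
    return sum(p * r for p, r in zip(probs[1:], revenues))


def simulate_choice(assortment, theta, rng):
    """Sample one consumer response: 0 for no purchase, else a product index."""
    probs = mnl_choice_probs(assortment, theta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [1.0, -0.5]                    # hypothetical parameters
    assortment = [[1.0, 0.0], [0.5, 1.0]]  # two products, two attributes
    revenues = [3.0, 6.0]
    print(mnl_choice_probs(assortment, theta))
    print(expected_revenue(assortment, theta, revenues))
    print(simulate_choice(assortment, theta, rng))
```

In the bandit setting, `theta` is unknown to the decision-maker and has to be learned from responses like those produced by `simulate_choice`, while each round's assortment is chosen to keep cumulative expected revenue high.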


Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Auction Theory and Applications