A Tractable Online Learning Algorithm for the Multinomial Logit Contextual Bandit
Priyank Agrawal, Theja Tulabandhula, Vashist Avadhanula

TL;DR
This paper introduces a new online learning algorithm for the MNL-Contextual Bandit problem that achieves better regret bounds and is computationally tractable, improving decision-making in dynamic assortment optimization.
Contribution
It proposes an optimistic algorithm with a convex relaxation for the MNL-Contextual Bandit problem, reducing regret bounds and computational complexity.
Findings
Regret bounded by O(√(dT) + κ), an improvement over previous bounds.
Convex relaxation enables tractable decision-making.
Algorithm performs well in dynamic assortment optimization scenarios.
Abstract
In this paper, we consider the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem, where a decision-maker offers a subset (assortment) of products to a consumer and observes the response in every round. Consumers purchase products to maximize their utility. We assume that a set of attributes describes the products, and the mean utility of a product is linear in the values of these attributes. We model consumer choice behavior using the widely used Multinomial Logit (MNL) model and consider the decision-maker's problem of dynamically learning the model parameters while optimizing cumulative revenue over the selling horizon T. Though this problem has attracted considerable attention in recent times, many existing methods often involve solving an intractable non-convex optimization problem. Their theoretical performance guarantees…
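The choice model described in the abstract can be illustrated with a minimal sketch. It assumes the standard MNL form with a no-purchase option whose utility is normalized to zero, which the excerpt above does not state explicitly. The names theta, catalog, revenues, and all function names are hypothetical, and the brute-force search is only a stand-in for the optimization step on a tiny catalog. It is not the paper's algorithm or its convex relaxation.

```python
import itertools
import math


def mnl_choice_probabilities(theta, features):
    """Return (purchase probabilities, no-purchase probability) for an assortment.

    theta: list of d floats (utility parameters, assumed known here).
    features: one d-dimensional attribute vector per offered product.
    """
    # Mean utility is linear in the product attributes.
    utilities = [sum(t * x for t, x in zip(theta, xs)) for xs in features]
    weights = [math.exp(u) for u in utilities]
    # Assumption: the outside (no-purchase) option has utility 0, so weight 1.
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights], 1.0 / denom


def expected_revenue(theta, features, revenues):
    """Expected revenue of offering the given products once."""
    probs, _ = mnl_choice_probabilities(theta, features)
    return sum(r * p for r, p in zip(revenues, probs))


def best_assortment(theta, catalog, revenues, max_size):
    """Enumerate every assortment of up to max_size products.

    Exponential in the catalog size; intended only for illustration.
    """
    best_set, best_value = (), 0.0
    for k in range(1, max_size + 1):
        for subset in itertools.combinations(range(len(catalog)), k):
            value = expected_revenue(
                theta,
                [catalog[i] for i in subset],
                [revenues[i] for i in subset],
            )
            if value > best_value:
                best_set, best_value = subset, value
    return best_set, best_value
```

In the bandit setting theta is unknown and must be learned from the observed purchases, so a learner would replace the known theta above with an estimate plus an optimism term. The sketch only shows how an assortment maps to purchase probabilities and revenue.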
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Auction Theory and Applications
