MNL-Bandits under Inventory and Limited Switches Constraints
Hongbin Zhang, Yu Yang, Feng Wu, Qixin Zhang

TL;DR
This paper addresses the challenge of optimizing product assortments under inventory and limited switching constraints using a UCB-like algorithm, demonstrating sub-linear regret bounds and superior empirical performance.
Contribution
It introduces a novel algorithm for assortment optimization under practical constraints, with proven regret bounds and improved empirical results.
Findings
The proposed algorithm achieves sub-linear regret bounds.
It outperforms baseline methods in numerical experiments.
The regret bound is nearly optimal with respect to the time horizon.
Abstract
Optimizing the assortment of products to display to customers is key to increasing revenue for both offline and online retailers. To trade off between exploring customers' preferences and exploiting the choices learned from data, we adopt the multinomial logit (MNL) choice model to capture customers' choices over products and study the problem of optimizing assortments over a planning horizon to maximize the retailer's profit. To make the problem setting more practical, we consider both an inventory constraint and a limited switches constraint: the retailer cannot use up the resource inventory before the planning horizon ends and is forbidden to switch the assortment shown to customers too many times. Such a setting suits the case when an online retailer wants to dynamically optimize the assortment selection for a population of customers. We develop an…
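For readers unfamiliar with the MNL choice model named in the abstract, the following is a minimal sketch of its standard form, not the paper's algorithm. Under MNL, a customer offered assortment S buys product i with probability v_i / (1 + sum of v_j over j in S), where the constant 1 is the normalized weight of the no-purchase option. The function names, the preference weights `v`, and the per-unit revenues `r` are hypothetical and chosen only for illustration; the inventory and switching constraints are not modeled here.

```python
from typing import Dict, Iterable, Tuple


def mnl_choice_probs(
    assortment: Iterable[int], v: Dict[int, float]
) -> Tuple[Dict[int, float], float]:
    """Return (purchase probability per offered product, no-purchase probability).

    Assumes the standard MNL normalization where the no-purchase
    option has weight 1 (an illustrative convention).
    """
    items = list(assortment)
    denom = 1.0 + sum(v[i] for i in items)
    probs = {i: v[i] / denom for i in items}
    return probs, 1.0 / denom


def expected_revenue(
    assortment: Iterable[int], v: Dict[int, float], r: Dict[int, float]
) -> float:
    """Expected revenue from one customer shown `assortment` under MNL."""
    probs, _ = mnl_choice_probs(assortment, v)
    return sum(r[i] * p for i, p in probs.items())


if __name__ == "__main__":
    # Hypothetical preference weights and revenues for two products.
    v = {1: 1.0, 2: 2.0}
    r = {1: 10.0, 2: 5.0}
    probs, no_purchase = mnl_choice_probs([1, 2], v)
    print(probs, no_purchase)
    print(expected_revenue([1, 2], v, r))
```

In a bandit setting the weights `v` are unknown and must be estimated from observed purchases, which is where the exploration and exploitation trade-off described above arises.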
Taxonomy
Topics: Advanced Bandit Algorithms Research · Supply Chain and Inventory Management · Optimization and Search Problems
