Nearly Minimax Optimal Regret for Multinomial Logistic Bandit

Joongkyu Lee; Min-hwan Oh

arXiv:2405.09831·stat.ML·October 17, 2025

TL;DR

This paper establishes nearly minimax optimal regret bounds for the multinomial logit bandit problem, introducing an efficient algorithm that matches these bounds under both uniform and non-uniform rewards.

Contribution

It provides the first proof of minimax optimality in the contextual MNL bandit setting and proposes a computationally efficient algorithm achieving these bounds.

Findings

01

Achieves a regret bound of $\tilde{O}(d\sqrt{T/K})$ under uniform rewards.

02

Achieves a regret bound of $\tilde{O}(d\sqrt{T})$ under non-uniform rewards.

03

Introduces the OFU-MNL+ algorithm with theoretical guarantees.

Abstract

In this paper, we study the contextual multinomial logit (MNL) bandit problem in which a learning agent sequentially selects an assortment based on contextual information, and user feedback follows an MNL choice model. There has been a significant discrepancy between lower and upper regret bounds, particularly regarding the maximum assortment size $K$. Additionally, the variation in reward structures between these bounds complicates the quest for optimality. Under uniform rewards, where all items have the same expected reward, we establish a regret lower bound of $\Omega(d\sqrt{T/K})$ and propose a constant-time algorithm, OFU-MNL+, that achieves a matching upper bound of $\tilde{O}(d\sqrt{T/K})$. We also provide instance-dependent minimax regret bounds under uniform rewards. Under non-uniform rewards, we prove a lower bound of $\Omega(d\sqrt{T})$ and an upper bound of…
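To make the setting concrete, here is a minimal sketch of the MNL choice model the abstract refers to. It is not the paper's OFU-MNL+ algorithm. It assumes the standard formulation with an outside (no-choice) option of utility zero and linear utilities, which the abstract does not spell out. The names `mnl_choice_probs`, `expected_reward`, `sample_choice`, `FEATURES`, and `THETA` are hypothetical and chosen for illustration only.

```python
import math
import random


def mnl_choice_probs(features, theta, assortment):
    """Choice probabilities for an offered assortment under an MNL model.

    Assumption: item i has utility <features[i], theta>, and the user may
    also pick an outside (no-choice) option with utility 0.
    Returns (per-item probabilities in assortment order, outside probability).
    """
    utilities = [sum(f * t for f, t in zip(features[i], theta)) for i in assortment]
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)  # the 1.0 is exp(0) for the outside option
    return [w / denom for w in weights], 1.0 / denom


def expected_reward(features, theta, rewards, assortment):
    """Expected reward of an assortment: sum of reward times choice probability."""
    probs, _ = mnl_choice_probs(features, theta, assortment)
    return sum(rewards[i] * p for i, p in zip(assortment, probs))


def sample_choice(features, theta, assortment, rng):
    """Draw one user choice; None stands for the outside option."""
    probs, p_outside = mnl_choice_probs(features, theta, assortment)
    options = [None] + list(assortment)
    return rng.choices(options, weights=[p_outside] + probs, k=1)[0]


# Illustrative values only (not from the paper): three items, d = 2.
FEATURES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
THETA = [0.5, -0.5]
UNIFORM_REWARDS = [1.0, 1.0, 1.0]

if __name__ == "__main__":
    rng = random.Random(0)
    offered = [0, 2]  # an assortment of size K = 2
    print(mnl_choice_probs(FEATURES, THETA, offered))
    print(expected_reward(FEATURES, THETA, UNIFORM_REWARDS, offered))
    print(sample_choice(FEATURES, THETA, offered, rng))
```

Under this formulation with uniform rewards, the expected reward is the probability that the user picks any offered item, so offering more items never lowers it. That is one way to see why the maximum assortment size $K$ enters the uniform-reward bound.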

Videos

Nearly Minimax Optimal Regret for Multinomial Logistic Bandit · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and ELM · Smart Grid Energy Management