Tractable Multinomial Logit Contextual Bandits with Non-Linear Utilities

Taehyun Hwang; Dahngoon Kim; Min-hwan Oh

arXiv:2601.06913 · cs.LG · January 13, 2026

TL;DR

This paper introduces a computationally efficient algorithm for multinomial logit contextual bandits with non-linear utility functions, including neural networks, achieving near-optimal regret bounds and demonstrating strong empirical performance.

Contribution

It presents the first tractable algorithm for MNL contextual bandits with non-linear utilities that provably attains $\tilde{O}(\sqrt{T})$ regret, extending beyond linear utility models.

Findings

01

Achieves an $\tilde{O}(\sqrt{T})$ regret bound.

02

Effective in both realizable and misspecified scenarios.

03

First computationally tractable method with provable guarantees for non-linear utilities.

Abstract

We study the multinomial logit (MNL) contextual bandit problem for sequential assortment selection. Although most existing research assumes utility functions to be linear in item features, this linearity assumption restricts the modeling of intricate interactions between items and user preferences. A recent work (Zhang & Luo, 2024) has investigated general utility function classes, yet its method faces fundamental trade-offs between computational tractability and statistical efficiency. To address this limitation, we propose a computationally efficient algorithm for MNL contextual bandits leveraging the upper confidence bound principle, specifically designed for non-linear parametric utility functions, including those modeled by neural networks. Under a realizability assumption and a mild geometric condition on the utility function class, our algorithm achieves a regret bound of…
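To make the setting concrete, here is a minimal sketch of the MNL choice model with a non-linear utility and an optimism-based assortment choice. It is an illustration under stated assumptions, not the paper's algorithm: the `utility` function (a one-unit tanh model), the `bonus` values standing in for confidence widths, the per-item `revenues`, and the brute-force search are all hypothetical choices made here. The outside (no-purchase) option is assumed to have utility 0, a common normalization.

```python
import math
from itertools import combinations


def utility(features, theta):
    """Hypothetical non-linear utility: tanh of a linear score."""
    return math.tanh(sum(x * w for x, w in zip(features, theta)))


def mnl_probs(utilities):
    """Return (p_outside, [p_i]) for an offered assortment under MNL.

    Assumes the outside option has utility 0, so
    p_i = exp(u_i) / (1 + sum_j exp(u_j)).
    """
    shift = max(0.0, max(utilities))  # shift for numerical stability
    weights = [math.exp(u - shift) for u in utilities]
    outside = math.exp(-shift)
    total = outside + sum(weights)
    return outside / total, [w / total for w in weights]


def expected_revenue(utilities, revenues):
    """Expected revenue of an assortment given item utilities."""
    _, probs = mnl_probs(utilities)
    return sum(p * r for p, r in zip(probs, revenues))


def optimistic_assortment(utilities, bonus, revenues, max_size):
    """Brute-force the assortment maximizing revenue under optimistic
    utilities u_i + bonus_i (bonus is an assumed confidence width)."""
    ucb = [u + b for u, b in zip(utilities, bonus)]
    best, best_value = (), 0.0
    for size in range(1, max_size + 1):
        for subset in combinations(range(len(ucb)), size):
            value = expected_revenue(
                [ucb[i] for i in subset], [revenues[i] for i in subset]
            )
            if value > best_value:
                best, best_value = subset, value
    return best, best_value
```

The exhaustive search is only practical for small item sets; it is used here to keep the sketch short and transparent, and says nothing about how the paper achieves tractability.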

Videos

Tractable Multinomial Logit Contextual Bandits with Non-Linear Utilities (SlidesLive)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Recommender Systems and Techniques