On Pareto Optimality for Parametric Choice Bandits

Jierui Zuo; Hanzhang Qin

arXiv:2501.19277·stat.ML·April 27, 2026

TL;DR

This paper develops a theoretical framework for online assortment optimization under stochastic choice, balancing revenue performance and inference quality, with explicit regret and error bounds for specific choice models.

Contribution

The paper introduces a unified scheme based on optimism in the face of uncertainty (OFU) with regularized likelihood estimators, derives explicit regret and inference bounds for MNL and other models, and characterizes Pareto-optimal exploration rates.

Findings

01

Regret bound of $\tilde{O}(n_T + T/\sqrt{n_T})$ for MNL.

02

Revenue-contrast error of $\tilde{O}(1/\sqrt{n_T})$ for MNL.

03

Pareto-optimal exploration rates range from $T^{2/3}$ to $T$, balancing regret and inference.
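
The lower endpoint $T^{2/3}$ can be recovered from the two stated rates alone. The sketch below is illustrative and not taken from the paper: it assumes $n_T$ denotes the number of forced-exploration rounds, drops all constants and logarithmic factors, and uses hypothetical function names. Minimizing $n + T/\sqrt{n}$ over $n$ gives $n = (T/2)^{2/3}$. Below that budget both regret and inference error get worse, so such budgets are dominated. Above it, regret grows while inference error shrinks, which is the trade-off region.

```python
import math

def regret_rate(n, T):
    # Stated MNL regret rate with constants and log factors dropped (assumption).
    return n + T / math.sqrt(n)

def inference_rate(n):
    # Stated revenue-contrast error rate with constants and log factors dropped.
    return 1.0 / math.sqrt(n)

def regret_minimizing_budget(T):
    # d/dn [n + T * n**-0.5] = 1 - T / (2 * n**1.5) = 0  =>  n = (T/2)**(2/3)
    return (T / 2.0) ** (2.0 / 3.0)

T = 1_000_000
n_star = regret_minimizing_budget(T)
for n in (n_star / 10, n_star, 10 * n_star, T):
    print(f"n_T={n:10.0f}  regret~{regret_rate(n, T):10.0f}  "
          f"error~{inference_rate(n):.5f}")
```

In the printed table, the row below `n_star` is worse on both columns, while the rows above it show regret rising as the error falls.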

Abstract

We study online assortment optimization under stochastic choice when a decision maker simultaneously values cumulative revenue performance and the quality of post-hoc inference on revenue contrasts. We analyze a forced-exploration optimism-in-the-face-of-uncertainty (OFU) scheme that combines two regularized maximum-likelihood estimators: one based on all observations for sequential decision making, and one based only on exploration rounds for inference. Our general theory is developed under predictable score proxies and per-round action-dependent curvature domination. Under these conditions we establish a self-normalized concentration inequality, a likelihood-based ellipsoidal confidence-set theorem, and a regret bound for approximate optimistic actions that explicitly accounts for optimization error. For the multinomial logit (MNL) model we derive explicit score and curvature proxies…
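
For readers unfamiliar with the MNL model named above, the following is a minimal sketch of the standard textbook form, with the no-purchase utility normalized to zero. The paper's exact parametrization may differ. The function names are hypothetical, and reading a "revenue contrast" as the difference in expected revenue between two assortments is an assumption made here for illustration.

```python
import math

def mnl_choice_probs(utilities, assortment):
    """Choice probabilities under a standard MNL with outside-option utility 0."""
    weights = {i: math.exp(utilities[i]) for i in assortment}
    denom = 1.0 + sum(weights.values())  # the 1.0 is exp(0) for no purchase
    probs = {i: w / denom for i, w in weights.items()}
    probs[None] = 1.0 / denom  # key None marks the no-purchase outcome
    return probs

def expected_revenue(utilities, revenues, assortment):
    # Revenue of each offered item weighted by its choice probability.
    probs = mnl_choice_probs(utilities, assortment)
    return sum(revenues[i] * probs[i] for i in assortment)

def revenue_contrast(utilities, revenues, s1, s2):
    # Assumed meaning: difference in expected revenue between two assortments.
    return (expected_revenue(utilities, revenues, s1)
            - expected_revenue(utilities, revenues, s2))

# Toy instance with made-up utilities and prices.
utilities = {0: 0.0, 1: 0.0, 2: 0.5}
revenues = {0: 3.0, 1: 6.0, 2: 4.0}
print(expected_revenue(utilities, revenues, [0, 1]))
print(revenue_contrast(utilities, revenues, [0, 1], [2]))
```

In an online setting the utilities are unknown and must be estimated from observed choices, which is where the regularized maximum-likelihood estimators described in the abstract come in.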
