Thompson Sampling for Repeated Newsvendor

Li Chen; Hanzhang Qin; Yunbei Xu; Ruihao Zhu; Weizhou Zhang

arXiv:2502.09900 · cs.LG · January 19, 2026

Open Access

TL;DR

This paper analyzes Thompson Sampling in the repeated newsvendor problem. It derives regret bounds, explains how the algorithm handles the exploration-exploitation trade-off, and reports simulations in which it outperforms existing methods.

Contribution

It offers the first regret analysis of Thompson Sampling for the repeated newsvendor model, extending to general parametric distributions and providing practical insights.

Findings

01 Thompson Sampling achieves near-optimal regret bounds.

02 TS automatically balances exploration and exploitation based on order size.

03 Numerical results show TS outperforms existing approaches.

Abstract

In this paper, we investigate the performance of Thompson Sampling (TS) for online learning with censored feedback, focusing primarily on the classic repeated newsvendor model, a foundational framework in inventory management, and demonstrating how our techniques can be naturally extended to a broader class of problems. We first model demand using a Weibull distribution and initialize TS with a Gamma prior to dynamically adjust order quantities. Our analysis establishes optimal (up to logarithmic factors) frequentist regret bounds for TS without imposing restrictive prior assumptions. More importantly, it yields novel and highly interpretable insights on how TS addresses the exploration-exploitation trade-off in the repeated newsvendor setting. Specifically, our results show that when past order quantities are sufficiently large to overcome censoring, TS accurately estimates the unknown…
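
The Weibull-demand, Gamma-prior setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on assumptions the abstract does not state: the Weibull shape k is known, the unknown parameter theta enters through the survival function exp(-theta * x^k), only sales min(demand, order) are observed, and the order is the critical-fractile quantile under the sampled theta. Under those assumptions a Gamma(a, b) prior on theta (shape a, rate b) stays conjugate with censored data. All function and parameter names are hypothetical.

```python
import math
import random


def optimal_order(theta, k, fractile):
    """Quantile of the assumed demand law: solve 1 - exp(-theta * y**k) = fractile."""
    return (-math.log(1.0 - fractile) / theta) ** (1.0 / k)


def run_ts(theta_true, k, fractile, horizon, a0=1.0, b0=1.0, seed=0):
    """Thompson Sampling sketch with censored sales feedback.

    Assumption: Gamma(a, b) prior on theta with shape a and rate b.
    """
    rng = random.Random(seed)
    a, b = a0, b0
    orders = []
    for _ in range(horizon):
        # Sample theta from the current posterior (gammavariate takes a scale).
        theta_sample = rng.gammavariate(a, 1.0 / b)
        y = optimal_order(theta_sample, k, fractile)
        # Simulated demand; weibullvariate takes (scale, shape).
        demand = rng.weibullvariate(theta_true ** (-1.0 / k), k)
        sales = min(demand, y)  # only sales are observed
        if demand < y:
            a += 1.0  # uncensored observation adds one to the shape
        b += sales ** k  # every observation adds sales**k to the rate
        orders.append(y)
    return orders, a, b


if __name__ == "__main__":
    orders, a, b = run_ts(theta_true=0.5, k=2.0, fractile=0.8, horizon=500)
    print("last order:", orders[-1])
    print("oracle order:", optimal_order(0.5, 2.0, 0.8))
    print("posterior mean of theta:", a / b)
```

In this sketch a small order censors demand often, so the shape parameter a grows slowly and the posterior stays wide; a wide posterior in turn produces occasional small sampled values of theta and hence larger orders. That is one informal way to read the abstract's remark about order quantities and censoring, offered here as an interpretation rather than a result from the paper.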

Taxonomy

Topics: Consumer Market Behavior and Pricing · Supply Chain and Inventory Management · Advanced Bandit Algorithms Research