Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback
Wonyoung Kim, Garud Iyengar, Assaf Zeevi

TL;DR
This paper introduces improved algorithms for multi-period, multi-class packing problems with bandit feedback, achieving faster convergence and lower regret, and extends existing results to a multi-class setting with practical numerical validation.
Contribution
The paper develops a new estimator with faster convergence, proposes a closed-form bandit policy, and extends regret bounds to multi-class problems, addressing an open problem in the field.
Findings
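The proposed estimator guarantees a faster convergence rate and, consequently, lower regret.
When the contexts are non-degenerate, the regret of the bandit policy is sublinear in the context dimension, the number of classes, and the time horizon.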
Numerical experiments show superior performance over benchmarks.
Abstract
We consider the linear contextual multi-class multi-period packing problem (LMMP) where the goal is to pack items such that the total vector of consumption is below a given budget vector and the total value is as large as possible. We consider the setting where the reward and the consumption vector associated with each action is a class-dependent linear function of the context, and the decision-maker receives bandit feedback. LMMP includes linear contextual bandits with knapsacks and online revenue management as special cases. We establish a new estimator which guarantees a faster convergence rate, and consequently, a lower regret in such problems. We propose a bandit policy that is a closed-form function of said estimated parameters. When the contexts are non-degenerate, the regret of the proposed policy is sublinear in the context dimension, the number of classes, and the time horizon…
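To make the problem setup in the abstract concrete, here is a minimal toy sketch of an LMMP-style interaction loop. It is not the paper's estimator or policy: it uses a uniformly random policy, and every name, dimension, noise level, and the stopping rule are illustrative assumptions. In particular, it assumes one parameter set per class and one context per action; the paper's exact indexing may differ.

```python
import numpy as np


def simulate_lmmp(T=200, J=3, K=4, d=5, m=2, budget=20.0, seed=0):
    """Toy LMMP-style environment with a uniformly random policy.

    Assumed (hypothetical) setup: J classes, K actions, d-dimensional
    contexts, m resources, and the same budget for every resource.
    """
    rng = np.random.default_rng(seed)
    # Class-dependent linear parameters (assumed shapes).
    theta = rng.uniform(0.0, 1.0, size=(J, d)) / d   # reward parameters
    W = rng.uniform(0.0, 1.0, size=(J, m, d)) / d    # consumption parameters

    remaining = np.full(m, budget)
    total_reward = 0.0
    rounds_played = 0
    for _ in range(T):
        j = rng.integers(J)                       # class of the arriving item
        X = rng.uniform(0.0, 1.0, size=(K, d))    # one context per action
        k = rng.integers(K)                       # placeholder: random action

        # Bandit feedback: only the chosen action's noisy outcome is seen.
        use = np.clip(W[j] @ X[k] + 0.05 * rng.standard_normal(m), 0.0, None)
        if np.any(use > remaining):
            break  # assumed rule: stop once the budget would be exceeded
        reward = theta[j] @ X[k] + 0.05 * rng.standard_normal()

        remaining -= use
        total_reward += reward
        rounds_played += 1
    return total_reward, budget - remaining, rounds_played


total_reward, consumed, rounds_played = simulate_lmmp()
print(rounds_played, consumed, total_reward)
```

A learning policy would replace the random choice of `k` with a rule based on estimates of `theta` and `W` built from the observed rewards and consumptions; the packing constraint is that `consumed` never exceeds the budget in any coordinate.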
Taxonomy
Topics: Optimization and Search Problems · Supply Chain and Inventory Management · Smart Parking Systems Research
