Non-stationary Bandits with Knapsacks

Shang Liu; Jiashuo Jiang; Xiaocheng Li

arXiv:2205.12427 · cs.LG · October 13, 2022 · 6 cites

TL;DR

This paper studies bandits with knapsacks (BwK) in a non-stationary environment. It introduces a new non-stationarity measure and derives upper and lower bounds that account for both the resource constraints and the changes in the environment.

Contribution

The paper proposes a global non-stationarity measure for BwK, analyzes the problem in non-stationary environments, and extends the analysis to online convex optimization with constraints.

Findings

01. Established upper and lower bounds for non-stationary BwK.

02. Introduced a new non-stationarity measure suitable for constrained settings.

03. Extended analysis to online convex optimization with constraints.

Abstract

In this paper, we study the problem of bandits with knapsacks (BwK) in a non-stationary environment. The BwK problem generalizes the multi-armed bandit (MAB) problem to model the resource consumption associated with playing each arm. At each time step, the decision maker/player chooses an arm to play, receives a reward, and consumes a certain amount of each of multiple resource types. The objective is to maximize the cumulative reward over a finite horizon subject to knapsack constraints on the resources. Existing works study the BwK problem under either a stochastic or an adversarial environment. Our paper considers a non-stationary environment which continuously interpolates between these two extremes. We first show that the traditional notion of variation budget is insufficient to characterize the non-stationarity of the BwK problem for a sublinear regret due to…
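The interaction protocol described in the abstract can be sketched in a few lines of Python. This is only an illustration of the problem setting, not the paper's algorithm: the function name `simulate_bwk`, the Bernoulli rewards and costs, the linear drift of the mean rewards, and the uniformly random policy are all assumptions made for the example.

```python
import random


def simulate_bwk(mean_rewards, mean_costs, budgets, horizon, policy, seed=0):
    """Play a BwK instance until the horizon ends or a resource runs out.

    mean_rewards[t][a]   : mean reward of arm a at time t (may drift with t)
    mean_costs[t][a][j]  : mean consumption of resource j by arm a at time t
    budgets[j]           : initial budget of resource j
    policy(t, remaining) : returns the index of the arm to play
    """
    rng = random.Random(seed)
    remaining = list(budgets)
    total_reward = 0.0
    for t in range(horizon):
        arm = policy(t, remaining)
        # Assumption: rewards and per-resource costs are Bernoulli draws.
        reward = 1.0 if rng.random() < mean_rewards[t][arm] else 0.0
        costs = [1.0 if rng.random() < c else 0.0 for c in mean_costs[t][arm]]
        # Stop before a pull that would overdraw any resource.
        if any(c > r for c, r in zip(costs, remaining)):
            break
        remaining = [r - c for r, c in zip(remaining, costs)]
        total_reward += reward
    return total_reward, remaining


horizon, n_arms = 200, 2
# Hypothetical non-stationary instance: arm 0's mean reward drifts down
# linearly while arm 1's drifts up; one resource with stationary mean cost 0.5.
mean_rewards = [
    [0.8 - 0.6 * t / horizon, 0.2 + 0.6 * t / horizon] for t in range(horizon)
]
mean_costs = [[[0.5], [0.5]] for _ in range(horizon)]
budgets = [50.0]

policy_rng = random.Random(1)


def uniform_policy(t, remaining):
    # Baseline that ignores all feedback; a real BwK policy would learn.
    return policy_rng.randrange(n_arms)


total_reward, remaining = simulate_bwk(
    mean_rewards, mean_costs, budgets, horizon, uniform_policy
)
print(total_reward, remaining)
```

With an expected per-step consumption of 0.5 and a budget of 50, the resource is likely to run out well before step 200, so the knapsack constraint rather than the horizon ends the run. A learning policy would replace `uniform_policy` and use the observed rewards and costs.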

Videos

Non-stationary Bandits with Knapsacks (SlidesLive)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Auction Theory and Applications