The Symmetry between Arms and Knapsacks: A Primal-Dual Approach for Bandits with Knapsacks
Xiaocheng Li, Chunlin Sun, Yinyu Ye

TL;DR
This paper introduces a primal-dual approach to the bandits with knapsacks problem, achieving a novel problem-dependent logarithmic regret bound by exploiting symmetry between arms and knapsacks.
Contribution
It proposes a new primal-dual framework and sub-optimality measure, leading to the first problem-dependent logarithmic regret bound for BwK.
Findings
Achieves a problem-dependent logarithmic regret bound.
Identifies symmetry between arms and knapsacks via primal-dual analysis.
Develops a two-phase adaptive algorithm for resource allocation.
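The arm/knapsack symmetry in the findings above can be illustrated on a toy instance. The sketch below is not the paper's algorithm: it assumes the standard BwK linear relaxation (maximize mu·x subject to C x <= B, x >= 0, with the time horizon treated as one of the constraints), uses made-up numbers, and relies on a hypothetical helper `solve_two_constraint_lp` that brute-forces a two-constraint LP. It then lists which arms are played and which constraints are binding at the optimum.

```python
from itertools import combinations


def solve_two_constraint_lp(mu, C, B, tol=1e-9):
    """Maximize mu.x subject to C x <= B, x >= 0, for exactly two constraints.

    Enumerates basic solutions of the standard-form system [C | I] z = B
    and keeps the best feasible one. Intended for toy sizes only.
    """
    m = len(mu)
    # Columns of [C | I]: one per arm, then one slack column per constraint.
    cols = [(C[0][i], C[1][i]) for i in range(m)] + [(1.0, 0.0), (0.0, 1.0)]
    gains = list(mu) + [0.0, 0.0]  # slack variables earn no reward
    best_value, best_x = float("-inf"), None
    for a, b in combinations(range(m + 2), 2):
        det = cols[a][0] * cols[b][1] - cols[b][0] * cols[a][1]
        if abs(det) < tol:
            continue  # singular basis, skip
        # Cramer's rule for the 2x2 system [col_a col_b] (za, zb)^T = B.
        za = (B[0] * cols[b][1] - cols[b][0] * B[1]) / det
        zb = (cols[a][0] * B[1] - B[0] * cols[a][1]) / det
        if za < -tol or zb < -tol:
            continue  # basic solution is infeasible
        value = gains[a] * za + gains[b] * zb
        if value > best_value:
            x = [0.0] * m
            if a < m:
                x[a] = za
            if b < m:
                x[b] = zb
            best_value, best_x = value, x
    return best_value, best_x


# Hypothetical instance: 3 arms, constraint 0 is time (100 rounds),
# constraint 1 is a single knapsack with capacity 50.
mu = [0.9, 0.5, 0.3]                      # assumed mean rewards
C = [[1.0, 1.0, 1.0], [1.0, 0.2, 0.1]]    # assumed mean consumptions
B = [100.0, 50.0]

value, x = solve_two_constraint_lp(mu, C, B)
optimal_arms = [i for i, xi in enumerate(x) if xi > 1e-9]
binding = [
    j for j in range(2)
    if B[j] - sum(C[j][i] * x[i] for i in range(len(mu))) < 1e-9
]
print("LP value:", value)
print("arms played at the optimum:", optimal_arms)
print("binding constraints:", binding)
```

On this toy instance the optimum mixes arms 0 and 1 and exhausts both the time budget and the knapsack, so the number of arms played equals the number of binding constraints. That is one simple way to read the symmetry; the paper's own sub-optimality measure is not reproduced here.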
Abstract
In this paper, we study the bandits with knapsacks (BwK) problem and develop a primal-dual algorithm that achieves a problem-dependent logarithmic regret bound. The BwK problem extends the multi-armed bandit (MAB) problem to model the resource consumption associated with playing each arm, and the existing BwK literature has mainly focused on deriving asymptotically optimal distribution-free regret bounds. We first study the primal and dual linear programs underlying the BwK problem. From this primal-dual perspective, we discover a symmetry between arms and knapsacks, and then propose a new sub-optimality measure for the BwK problem. The sub-optimality measure highlights the important role of knapsacks in determining algorithm regret and inspires the design of our two-phase algorithm. In the first phase, the algorithm identifies the optimal arms and the binding…
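For readers unfamiliar with the linear programs the abstract refers to, the following is the standard deterministic relaxation commonly used in the BwK literature. The notation is an assumption made here for illustration and is not taken from this summary: $\mu \in \mathbb{R}^m$ denotes the mean rewards of the $m$ arms, $C \in \mathbb{R}^{d \times m}$ the mean consumption of each of the $d$ resources per arm, and $B \in \mathbb{R}^d$ the resource budgets.

```latex
\begin{align*}
\text{(Primal)}\quad & \max_{x \ge 0}\; \mu^\top x
  \quad \text{s.t.} \quad C x \le B, \\
\text{(Dual)}\quad & \min_{y \ge 0}\; B^\top y
  \quad \text{s.t.} \quad C^\top y \ge \mu, \\
\text{(Complementary slackness)}\quad &
  x_i^* \,\bigl(C^\top y^* - \mu\bigr)_i = 0, \qquad
  y_j^* \,\bigl(B - C x^*\bigr)_j = 0 .
\end{align*}
```

Under this reading, primal variables index arms and dual variables index knapsacks: an arm with $x_i^* > 0$ must have a tight dual constraint, and a knapsack with $y_j^* > 0$ must be a binding constraint. This pairing is plausibly what the abstract means by a symmetry between arms and knapsacks.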
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Smart Grid Energy Management
