Online Learning with Vector Costs and Bandits with Knapsacks
Thomas Kesselheim, Sahil Singla

TL;DR
This paper introduces a new online learning framework with vector costs, providing algorithms that achieve near-optimal regret and competitive ratios in both stochastic and adversarial settings, with applications to Bandits with Knapsacks.
Contribution
It develops a reduction from multi-dimensional to single-dimensional online learning, enabling the use of classical algorithms for vector cost problems and improving results for Bandits with Knapsacks.
Findings
Achieves sublinear regret in stochastic settings.
Provides a tight $O(\min\{p, \log d\})$ competitive ratio for adversarial arrivals.
Improves the competitive ratio for adversarial Bandits with Knapsacks to $O(\log d \times \log T)$.
Abstract
We introduce online learning with vector costs ($\text{OLVC}_p$) where in each time step $t \in \{1, \ldots, T\}$, we need to play an action that incurs an unknown vector cost in $[0,1]^d$. The goal of the online algorithm is to minimize the $\ell_p$ norm of the sum of its cost vectors. This captures the classical online learning setting for $d = 1$, and is interesting for general $d$ because of applications like online scheduling where we want to balance the load between different machines (dimensions). We study $\text{OLVC}_p$ in both stochastic and adversarial arrival settings, and give a general procedure to reduce the problem from $d$ dimensions to a single dimension. This allows us to use classical online learning algorithms in both full and bandit feedback models to obtain (near) optimal results. In particular, we obtain a single algorithm (up to the choice of learning…
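To make the objective concrete, the following is a minimal Python sketch of the setting the abstract describes: a learner picks one action per step, accumulates a load vector, and is judged by the $\ell_p$ norm of that vector. It is not the paper's reduction. The scalar surrogate used here (the marginal increase in the $\ell_p$ norm), the use of Hedge with full feedback, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def lp_norm(vec, p):
    """Return the l_p norm of a vector with non-negative entries."""
    return sum(x ** p for x in vec) ** (1.0 / p)


def scalarize(load, cost, p):
    """Hypothetical surrogate: increase in the l_p norm if `cost` is added to `load`."""
    new_load = [l + c for l, c in zip(load, cost)]
    return lp_norm(new_load, p) - lp_norm(load, p)


def hedge_on_surrogate(costs, p, eta, seed=0):
    """Run Hedge (multiplicative weights) on the scalar surrogate costs.

    costs[t][i] is the d-dimensional cost vector of action i at step t,
    revealed for every action after the choice is made (full feedback).
    Returns the learner's final load vector.
    """
    n = len(costs[0])
    d = len(costs[0][0])
    weights = [1.0 / n] * n
    load = [0.0] * d
    rng = random.Random(seed)
    for round_costs in costs:
        # Sample an action with probability proportional to its weight.
        action = rng.choices(range(n), weights=weights)[0]
        # Collapse each action's vector cost to one scalar, given the current load.
        surrogate = [scalarize(load, c, p) for c in round_costs]
        # Standard Hedge update, then renormalize to avoid underflow.
        weights = [w * math.exp(-eta * s) for w, s in zip(weights, surrogate)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # The learner pays the vector cost of the action it actually played.
        load = [l + c for l, c in zip(load, round_costs[action])]
    return load


if __name__ == "__main__":
    # Two machines, two actions: each action loads exactly one machine.
    T = 10
    costs = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(T)]
    final_load = hedge_on_surrogate(costs, p=2, eta=0.5)
    print(final_load, lp_norm(final_load, 2))
```

In this toy instance a balanced schedule gives a smaller $\ell_2$ norm than placing all load on one machine, which is the load-balancing intuition the abstract mentions. The learning rate `eta` is left as a free parameter and is not tuned here.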
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
