Online Learning with Vector Costs and Bandits with Knapsacks

Thomas Kesselheim; Sahil Singla

arXiv:2010.07346·cs.LG·October 19, 2020·5 cites

TL;DR

This paper introduces a new online learning framework with vector costs, providing algorithms that achieve near-optimal regret and competitive ratios in both stochastic and adversarial settings, with applications to Bandits with Knapsacks.

Contribution

It develops a reduction from multi-dimensional to single-dimensional online learning, enabling the use of classical algorithms for vector cost problems and improving results for Bandits with Knapsacks.
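To make the idea of reducing vector costs to scalar costs concrete, here is a minimal Python sketch of one generic scalarization. It is an illustration under stated assumptions and not necessarily the reduction used in the paper: each action's vector cost is weighted by the gradient of the $\ell_p$ norm at the current load, and the resulting scalar costs are fed to the classical Hedge (multiplicative weights) algorithm with full feedback. The function names, the fractional (expected) load update, the uniform gradient at zero load, and the learning rate `eta` are all assumptions made for this example.

```python
import math

def norm_gradient(load, p):
    """Gradient of the l_p norm at a nonnegative vector, for finite p >= 1.

    At the zero vector the gradient is undefined; as an arbitrary choice
    for this sketch we return uniform weights there.
    """
    norm = sum(x ** p for x in load) ** (1.0 / p)
    if norm == 0.0:
        return [1.0 / len(load)] * len(load)
    return [(x / norm) ** (p - 1) for x in load]

def hedge_with_scalarized_costs(cost_rounds, p, eta):
    """Run Hedge on scalar costs derived from vector costs.

    cost_rounds[t][i] is the cost vector in [0, 1]^d of action i at step t.
    Returns the expected accumulated load vector of the randomized play.
    """
    n = len(cost_rounds[0])
    d = len(cost_rounds[0][0])
    weights = [1.0] * n
    load = [0.0] * d
    for costs in cost_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Scalarize: inner product of each cost vector with the norm gradient.
        grad = norm_gradient(load, p)
        scalar = [sum(g * c for g, c in zip(grad, costs[i])) for i in range(n)]
        # Track the expected load of the randomized action choice.
        for j in range(d):
            load[j] += sum(probs[i] * costs[i][j] for i in range(n))
        # Standard Hedge update on the scalar costs.
        weights = [w * math.exp(-eta * s) for w, s in zip(weights, scalar)]
    return load

# Hypothetical instance: action 0 always costs [1, 1], action 1 costs [0, 0].
rounds = [[[1.0, 1.0], [0.0, 0.0]] for _ in range(50)]
final_load = hedge_with_scalarized_costs(rounds, p=2, eta=1.0)
```

Weighting by the norm gradient penalizes actions that add cost to dimensions that are already heavily loaded, which is the load-balancing intuition behind a norm objective. The sketch makes no claim about regret or competitive ratio.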

Findings

01

Achieves sublinear regret in stochastic settings.

02

Provides a tight $O(\min\{p, \log d\})$ competitive ratio for adversarial arrivals.

03

Improves the competitive ratio for adversarial Bandits with Knapsacks to $O(\log d \times \log T)$.

Abstract

We introduce online learning with vector costs (OLVC$_p$) where in each time step $t \in \{1, \dots, T\}$, we need to play an action $i \in \{1, \dots, n\}$ that incurs an unknown vector cost in $[0, 1]^{d}$. The goal of the online algorithm is to minimize the $\ell_{p}$ norm of the sum of its cost vectors. This captures the classical online learning setting for $d = 1$, and is interesting for general $d$ because of applications like online scheduling where we want to balance the load between different machines (dimensions). We study OLVC$_p$ in both stochastic and adversarial arrival settings, and give a general procedure to reduce the problem from $d$ dimensions to a single dimension. This allows us to use classical online learning algorithms in both full and bandit feedback models to obtain (near) optimal results. In particular, we obtain a single algorithm (up to the choice of learning…
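The objective described in the abstract can be illustrated with a short Python sketch. It only evaluates the $\ell_p$ norm of the summed cost vectors for a fixed sequence of plays; the function names and the toy cost sequences are hypothetical and chosen for illustration.

```python
import math

def lp_norm(vector, p):
    """l_p norm of a vector; p may be math.inf for the max norm."""
    if math.isinf(p):
        return max(abs(x) for x in vector)
    return sum(abs(x) ** p for x in vector) ** (1.0 / p)

def olvc_objective(cost_vectors, p):
    """l_p norm of the coordinate-wise sum of the incurred cost vectors."""
    d = len(cost_vectors[0])
    total = [sum(c[j] for c in cost_vectors) for j in range(d)]
    return lp_norm(total, p)

# Hypothetical toy sequences with T = 3 steps and d = 2 machines.
balanced = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # load spread over both machines
skewed = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]    # all load on machine 0

balanced_l2 = olvc_objective(balanced, 2)
skewed_l2 = olvc_objective(skewed, 2)

# With d = 1 the norm is just the sum of scalar costs, for any p.
scalar_case = olvc_objective([[0.5], [0.25]], 3)
```

Here the balanced sequence has total load $(2, 1)$ with $\ell_2$ norm $\sqrt{5}$, while the skewed sequence has total load $(3, 0)$ with $\ell_2$ norm $3$. Under $p = 1$ both sequences cost $3$, so larger $p$ is what rewards balancing across dimensions.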

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms