Contextual Decision-Making with Knapsacks Beyond the Worst Case

Zhaohua Chen; Rui Ai; Mingwei Yang; Yuqi Pan; Chang Wang; Xiaotie Deng

arXiv:2211.13952 · cs.LG · December 19, 2024 · 1 citation

TL;DR

This paper studies decision-making under resource (knapsack) constraints and gives an algorithm whose regret beats the usual worst-case rate when the fluid LP has a unique, non-degenerate optimal solution, while remaining near-optimal in the worst case.

Contribution

The paper introduces an algorithm that combines the re-solving heuristic with distribution estimation. It achieves logarithmic regret when the fluid LP has a unique, non-degenerate optimal solution, and the results extend to continuous settings.
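To make the idea of re-solving with distribution estimation concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a heavily simplified setting: one resource, a binary accept/reject action, finitely many request types, and rewards and costs that are deterministic given the request type (so the unknown external factor is dropped). The names `solve_fluid_lp` and `run_resolving` are hypothetical. In each round the sketch re-estimates the request distribution from the samples seen so far, re-solves the fluid LP with the remaining per-round budget, and accepts the current request with the probability given by the LP solution.

```python
import random
from collections import Counter


def solve_fluid_lp(probs, rewards, costs, rho):
    """Solve the single-resource accept/reject fluid LP greedily.

    maximize   sum_k probs[k] * rewards[k] * x[k]
    subject to sum_k probs[k] * costs[k] * x[k] <= rho,  0 <= x[k] <= 1

    With one resource this is a fractional knapsack, so filling request types
    in decreasing reward-per-cost order is optimal. Assumes rewards >= 0 and
    costs > 0 (both are assumptions of this sketch).
    """
    x = [0.0] * len(probs)
    budget = rho
    order = sorted(range(len(probs)), key=lambda k: rewards[k] / costs[k], reverse=True)
    for k in order:
        weight = probs[k] * costs[k]  # expected per-round consumption if x[k] = 1
        if weight == 0.0:
            x[k] = 1.0  # zero estimated mass, so accepting this type is free in the LP
            continue
        x[k] = min(1.0, max(budget, 0.0) / weight)
        budget -= x[k] * weight
    return x


def run_resolving(T, budget, true_probs, rewards, costs, seed=0):
    """Simulate T rounds of re-solving with an empirical request distribution."""
    rng = random.Random(seed)
    types = list(range(len(true_probs)))
    counts = Counter()
    remaining = budget
    total_reward = 0.0
    for t in range(T):
        k = rng.choices(types, weights=true_probs)[0]  # observe the request type
        counts[k] += 1
        est = [counts[j] / (t + 1) for j in types]  # empirical distribution estimate
        rho = remaining / (T - t)  # remaining budget per remaining round
        x = solve_fluid_lp(est, rewards, costs, rho)  # re-solve with current state
        if costs[k] <= remaining and rng.random() < x[k]:
            remaining -= costs[k]
            total_reward += rewards[k]
    return total_reward, remaining
```

For example, with two equally likely request types, rewards 1 and 3, unit costs, and a per-round budget of 0.75, `solve_fluid_lp` always accepts the high-reward type and accepts the other type half the time. Re-solving in every round lets the acceptance probabilities track both the remaining budget and the improving distribution estimate.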

Findings

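01 Achieves $\widetilde{O}(1)$ regret when the fluid LP has a unique, non-degenerate optimal solution.

02 Proves that an $\Omega(\sqrt{T})$ gap between the fluid benchmark and the online optimum is unavoidable when the fluid LP has a degenerate optimal solution.

03 Maintains $\widetilde{O}(\sqrt{T})$ regret even in the worst case and under different feedback models.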
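For readers unfamiliar with the term, a fluid LP of the kind these findings refer to can be written generically as below. This is a sketch in assumed notation, not the paper's: $p(\theta)$ is the probability of request $\theta$, $\bar r(\theta,a)$ and $\bar c_i(\theta,a)$ are the expected reward and expected consumption of resource $i$ for action $a$ (averaged over the external factor), $B_i$ is the initial inventory of resource $i$, and $x(\theta,a)$ is the probability of playing $a$ on request $\theta$. Any slack in the last constraint corresponds to taking a null action.

```latex
\begin{aligned}
\max_{x}\quad & T \sum_{\theta} p(\theta) \sum_{a} \bar r(\theta,a)\, x(\theta,a) \\
\text{s.t.}\quad & T \sum_{\theta} p(\theta) \sum_{a} \bar c_i(\theta,a)\, x(\theta,a) \le B_i,
  \qquad i = 1,\dots,n, \\
& \sum_{a} x(\theta,a) \le 1, \quad x(\theta,a) \ge 0
  \qquad \text{for all } \theta, a .
\end{aligned}
```

The LP replaces the random sequence of requests with its expectation, so its optimal value upper-bounds what an online policy can earn in expectation. The findings turn on whether the optimal solution of this LP is unique and non-degenerate.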

Abstract

We study the framework of a dynamic decision-making scenario with resource constraints. In this framework, an agent, whose target is to maximize the total reward under the initial inventory, selects an action in each round upon observing a random request, leading to a reward and resource consumptions that are further associated with an unknown random external factor. While previous research has already established an $\widetilde{O}(\sqrt{T})$ worst-case regret for this problem, this work offers two results that go beyond the worst-case perspective: one for the worst-case gap between benchmarks and another for logarithmic regret rates. We first show that an $\Omega(\sqrt{T})$ distance between the commonly used fluid benchmark and the online optimum is unavoidable when the former has a degenerate optimal solution. On the algorithmic side, we merge the re-solving heuristic with…

Videos

Contextual Decision-Making with Knapsacks Beyond the Worst Case · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Auction Theory and Applications