Risk-Aware Algorithms for Adversarial Contextual Bandits

Wen Sun, Debadeepta Dey, and Ashish Kapoor

arXiv:1610.05129·cs.LG·October 18, 2016·1 cites

Open Access

TL;DR

This paper introduces algorithms for adversarial contextual bandits that incorporate risk constraints, balancing cost minimization with adherence to long-term risk thresholds in both full information and bandit settings.

Contribution

It develops a meta algorithm using online mirror descent and extends it to risk-aware contextual bandits with expert advice, achieving near-optimal regret and sublinear risk violation.

Findings

01 Achieves near-optimal regret in cost minimization.

02 Maintains sublinear growth of the cumulative risk-constraint violation.

03 Extends the full-information algorithms to the bandit setting with risk constraints.
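The two quantities behind these findings can be written out as follows. The notation is assumed here for illustration and is not taken from the paper: $c_t$ and $r_t$ are the per-arm cost and risk at round $t$, $x_t$ is the context, $a_t$ is the pulled arm, $\beta$ is the risk threshold, and $\Pi$ is a hypothetical comparator class (for example, the experts that themselves respect the risk threshold).

```latex
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} c_t(a_t) \;-\; \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\bigl(\pi(x_t)\bigr),
\qquad
\mathrm{Violation}_T \;=\; \sum_{t=1}^{T} \bigl( r_t(a_t) - \beta \bigr).
```

Under this reading, sublinear violation means $\mathrm{Violation}_T / T \to 0$, so the average risk of the pulled arms approaches the threshold $\beta$ as $T$ grows.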

Abstract

In this work we consider adversarial contextual bandits with risk constraints. At each round, nature prepares a context, a cost for each arm, and additionally a risk for each arm. The learner leverages the context to pull an arm and then receives the cost and risk associated with the pulled arm. In addition to minimizing the cumulative cost, the learner also needs to satisfy long-term risk constraints: the average of the cumulative risk from all pulled arms should not be larger than a pre-defined threshold. To address this problem, we first study the full-information setting, where in each round the learner receives an adversarial convex loss and a convex constraint. We develop a meta algorithm leveraging online mirror descent for the full-information setting and extend it to the contextual bandit setting with risk constraints using expert advice. Our algorithms can achieve…
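The full-information setting described above can be illustrated with a small sketch. This is not the paper's algorithm: it is a generic primal-dual scheme, assuming linear costs and risks over a distribution on arms, with an entropic mirror descent (multiplicative weights) step on the distribution and projected gradient ascent on a Lagrange multiplier. The function name `primal_dual_eg`, the step sizes `eta` and `mu`, and the toy data are all hypothetical, and no claim is made about regret or violation rates.

```python
import numpy as np


def primal_dual_eg(costs, risks, beta, eta=0.1, mu=0.1):
    """Illustrative primal-dual update in the full-information setting.

    costs, risks: arrays of shape (T, K) with entries assumed to lie in [0, 1].
    beta: risk threshold; eta, mu: primal and dual step sizes (assumed values).
    Returns the final distribution over arms, the final multiplier, and the
    per-round expected cost and expected risk of the distributions played.
    """
    T, K = costs.shape
    w = np.full(K, 1.0 / K)  # start from the uniform distribution over arms
    lam = 0.0                # Lagrange multiplier for the risk constraint
    exp_cost = np.empty(T)
    exp_risk = np.empty(T)
    for t in range(T):
        # Expected cost and risk of the distribution played this round.
        exp_cost[t] = w @ costs[t]
        exp_risk[t] = w @ risks[t]
        # Gradient in w of the Lagrangian  <c_t, w> + lam * (<r_t, w> - beta).
        grad_w = costs[t] + lam * risks[t]
        # Mirror descent with the entropy regularizer: multiplicative update
        # followed by renormalization onto the simplex.
        w = w * np.exp(-eta * grad_w)
        w = w / w.sum()
        # Gradient ascent on the multiplier, projected onto lam >= 0.
        lam = max(0.0, lam + mu * (exp_risk[t] - beta))
    return w, lam, exp_cost, exp_risk


# Toy instance (made up): arm 0 is cheap but risky, arm 1 is costly but safe.
T = 500
costs = np.tile([0.1, 0.5], (T, 1))
risks = np.tile([0.9, 0.1], (T, 1))
w, lam, exp_cost, exp_risk = primal_dual_eg(costs, risks, beta=0.3)
print("final distribution:", w)
print("average expected risk:", exp_risk.mean())
```

The multiplier rises while the played distribution exceeds the threshold, which shifts weight toward low-risk arms; the bandit variant in the paper additionally has to cope with observing only the pulled arm's cost and risk, which this sketch does not attempt.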

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics