Bandit problems with Levy processes

Asaf Cohen; Eilon Solan

arXiv:1407.7241·math.PR·August 23, 2015

Open Access

TL;DR

This paper studies two-armed bandit problems in continuous time in which the risky arm is of type High or Low and both types yield payoffs generated by a Levy process. It shows that the optimal strategy is a cut-off strategy and gives explicit expressions for the cut-off and the optimal payoff.

Contribution

It introduces a continuous-time bandit model with Levy process payoffs and explicitly characterizes the optimal cut-off strategy and payoff.

Findings

01. Optimal strategy is a cut-off policy

02. Explicit formulas for cut-off and payoff are derived

03. Model accommodates stochastic Levy process payoffs

Abstract

Bandit problems model the trade-off between exploration and exploitation in various decision problems. We study two-armed bandit problems in continuous time, where the risky arm can have two types: High or Low; both types yield stochastic payoffs generated by a Levy process. We show that the optimal strategy is a cut-off strategy and we provide an explicit expression for the cut-off and for the optimal payoff.
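
To make the idea of a cut-off strategy concrete, the following is a minimal illustrative sketch and not the paper's construction. It assumes a discrete-time approximation in which the risky arm's payoff is a Brownian motion with drift (one simple example of a Levy process), with a different drift for each type. The decision maker holds a belief p that the arm is High, updates it by Bayes' rule, and plays the risky arm only while p is at or above a cut-off. The function names, the parameter values, and the cut-off value itself are hypothetical; the explicit cut-off derived in the paper is not reproduced here.

```python
import math
import random


def update_belief(p, dx, dt, mu_high, mu_low, sigma):
    """Bayes update of p = P(type is High) after observing payoff increment dx over dt.

    Assumes Gaussian increments N(mu * dt, sigma^2 * dt) and 0 < p < 1.
    """
    # Log-likelihood ratio of High versus Low for the observed increment.
    llr = (mu_high - mu_low) * (dx - 0.5 * (mu_high + mu_low) * dt) / sigma ** 2
    odds = p / (1.0 - p) * math.exp(llr)
    return odds / (1.0 + odds)


def run_cutoff_policy(p0, cutoff, is_high, mu_high=1.0, mu_low=-1.0, sigma=1.0,
                      safe_rate=0.0, dt=0.01, horizon=10.0, discount=0.1, seed=0):
    """Simulate a cut-off policy and return (discounted payoff, final belief).

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    mu = mu_high if is_high else mu_low
    p, total = p0, 0.0
    steps = int(round(horizon / dt))
    for k in range(steps):
        weight = math.exp(-discount * k * dt)
        if p >= cutoff:
            # Play the risky arm: collect a noisy payoff and learn from it.
            dx = rng.gauss(mu * dt, sigma * math.sqrt(dt))
            total += weight * dx
            p = update_belief(p, dx, dt, mu_high, mu_low, sigma)
        else:
            # Play the safe arm: nothing is learned, so the belief stays
            # below the cut-off and the policy never returns to the risky arm.
            total += weight * safe_rate * dt
    return total, p


if __name__ == "__main__":
    payoff, belief = run_cutoff_policy(p0=0.5, cutoff=0.3, is_high=True)
    print(f"discounted payoff: {payoff:.3f}, final belief: {belief:.3f}")
```

In this sketch, switching to the safe arm is irreversible because the belief only moves while the risky arm is played, which is the usual reason a single threshold on the belief is enough to describe the policy.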

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems