Bandit problems with Levy processes
Asaf Cohen, Eilon Solan

TL;DR
This paper analyzes two-armed bandit problems in continuous time in which the risky arm is of type High or Low and yields payoffs generated by a Levy process. It shows that the optimal strategy is a cut-off strategy and gives explicit expressions for the cut-off and the optimal payoff.
Contribution
It studies a continuous-time two-armed bandit model in which the risky arm's payoffs follow a Levy process, and it explicitly characterizes the optimal cut-off strategy and the optimal payoff.
Findings
Optimal strategy is a cut-off policy
Explicit formulas for cut-off and payoff are derived
Both types of the risky arm (High and Low) yield stochastic payoffs generated by a Levy process
Abstract
Bandit problems model the trade-off between exploration and exploitation in various decision problems. We study two-armed bandit problems in continuous time, where the risky arm can have two types: High or Low; both types yield stochastic payoffs generated by a Levy process. We show that the optimal strategy is a cut-off strategy and we provide an explicit expression for the cut-off and for the optimal payoff.
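As a rough illustration of what a cut-off strategy looks like in practice, the Python sketch below tracks the posterior belief that the risky arm is High and plays the risky arm only while that belief stays at or above a cut-off. It is not the paper's formula: it assumes the simplest Levy process, a Brownian motion with drift mu_high or mu_low and known volatility sigma, and the function names, parameter values, and the cut-off value passed in are hypothetical placeholders rather than the explicit expressions derived in the paper.

```python
import math


def posterior_high(prior, x_t, t, mu_high, mu_low, sigma):
    """Posterior probability that the risky arm is High.

    Assumption (illustrative only): cumulative payoffs follow a Brownian
    motion with drift mu_high (High) or mu_low (Low) and volatility sigma.
    x_t is the cumulative payoff observed up to time t; prior is in (0, 1).
    """
    # Log-likelihood ratio of High versus Low for Brownian motion with drift.
    llr = (mu_high - mu_low) / sigma**2 * (x_t - 0.5 * (mu_high + mu_low) * t)
    # Bayes' rule in log-odds form: posterior odds = prior odds * likelihood ratio.
    log_odds = math.log(prior / (1.0 - prior)) + llr
    return 1.0 / (1.0 + math.exp(-log_odds))


def cutoff_action(belief, cutoff):
    """Cut-off rule: use the risky arm while the belief is at least the cut-off."""
    return "risky" if belief >= cutoff else "safe"


# Hypothetical numbers: even prior, drifts +1 / -1, unit volatility.
belief = posterior_high(prior=0.5, x_t=1.0, t=1.0, mu_high=1.0, mu_low=-1.0, sigma=1.0)
action = cutoff_action(belief, cutoff=0.3)  # 0.3 is a placeholder cut-off
```

In this sketch a good run of payoffs pushes the belief up and keeps the decision maker on the risky arm, while a bad run pushes it below the cut-off and triggers the switch to the safe arm.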
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems
