Undiscounted Bandit Games

Godfrey Keller; Sven Rady

arXiv:1909.13323 · econ.TH · August 26, 2020 · Games Econ. Behav.

Open Access

TL;DR

This paper studies strategic experimentation in undiscounted continuous-time two-armed bandit games and shows that the unique symmetric Markov perfect equilibrium has a simple closed form that does not depend on the precise specification of the payoff-generating processes.

Contribution

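It shows that the unique symmetric Markov perfect equilibrium of undiscounted bandit games with Lévy process payoffs can be computed in closed form from only three quantities at the current belief: the payoff of the safe arm, the expected current payoff of the risky arm, and the expected full-information payoff.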

Findings

01

Equilibrium is explicitly characterized in closed form.

02

Equilibrium does not depend on detailed process specifications.

03

Players use Markov strategies based on posterior beliefs.

Abstract

We analyze undiscounted continuous-time games of strategic experimentation with two-armed bandits. The risky arm generates payoffs according to a Lévy process with an unknown average payoff per unit of time which nature draws from an arbitrary finite set. Observing all actions and realized payoffs, plus a free background signal, players use Markov strategies with the common posterior belief about the unknown parameter as the state variable. We show that the unique symmetric Markov perfect equilibrium can be computed in a simple closed form involving only the payoff of the safe arm, the expected current payoff of the risky arm, and the expected full-information payoff, given the current belief. In particular, the equilibrium does not depend on the precise specification of the payoff-generating processes.
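As a rough illustration of the three quantities the abstract names, the sketch below computes them for a belief over a finite set of possible average payoffs. It does not reproduce the paper's equilibrium formula. The function names, the example numbers, and the reading of the "expected full-information payoff" as the belief-weighted average of max(safe payoff, true average payoff) are assumptions made for this example.

```python
from typing import Sequence


def expected_current_payoff(belief: Sequence[float], means: Sequence[float]) -> float:
    """Belief-weighted average payoff per unit of time from the risky arm."""
    return sum(p * mu for p, mu in zip(belief, means))


def expected_full_information_payoff(
    belief: Sequence[float], means: Sequence[float], safe: float
) -> float:
    """Expected payoff of a player who learns the true average payoff and
    then picks the better arm (assumed interpretation: max(safe, mean))."""
    return sum(p * max(safe, mu) for p, mu in zip(belief, means))


# Hypothetical example: two possible average payoffs, equally likely.
safe = 1.0           # payoff of the safe arm (assumed value)
means = [0.0, 2.0]   # finite set from which nature draws (assumed values)
belief = [0.5, 0.5]  # common posterior belief over `means`

m = expected_current_payoff(belief, means)
f = expected_full_information_payoff(belief, means, safe)
print(safe, m, f)
```

In this example the risky arm's expected current payoff equals the safe payoff, while the expected full-information payoff is strictly higher. The gap reflects the value of learning which state is true.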

Taxonomy

Topics: Experimental Behavioral Economics Studies · Auction Theory and Applications · Advanced Bandit Algorithms Research