Analyzing Risky Choices: Q-Learning for Deal-No Deal

Laszlo Korsos; Nicholas G. Polson

arXiv:1110.0883 · stat.AP · October 6, 2011

TL;DR

This paper derives an optimal strategy for the Deal or No Deal game show using Q-learning, analyzes contestants' risky choices, and inverts those choices against the optimal strategy to obtain implied bounds on each contestant's risk aversion.

Contribution

It introduces a Q-learning-based method for analyzing sequential decision making, infers bounds on risk aversion from observed game strategies, and shows that two players' choices are consistent with constant risk aversion except for their last choice.

Findings

01

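The strategies of the two players studied (Suzanne and Frank, from the European version of the game) are consistent with constant risk aversion, apart from their last choice.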

02

Comparing observed choices with the optimal strategy derived using Q-learning yields implied bounds on each player's level of risk aversion.

03

Each player's last choice is risk-seeking and deviates from constant risk aversion.

Abstract

We derive an optimal strategy in the popular Deal or No Deal game show. Q-learning quantifies the continuation value inherent in sequential decision making, and we use this to analyze contestants' risky choices. Given their choices and the optimal strategy, we invert to find implied bounds on their levels of risk aversion. In risky decision making, previous empirical evidence has suggested that past outcomes affect future choices and that contestants have time-varying risk aversion. We demonstrate that the strategies of two players (Suzanne and Frank) from the European version of the game are consistent with constant risk aversion levels except for their last risk-seeking choice.
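To make the idea of a continuation value concrete, here is a minimal sketch in Python. It is not the authors' implementation: it uses exact backward induction over a small prize set, and the CRRA utility form, the banker rule (a fixed fraction of the mean remaining prize), and all function names are assumptions introduced for illustration. Q_deal is the utility of accepting the offer; Q_no_deal is the expected value of continuing and then acting optimally. The bisection helper finds the risk aversion level at which a contestant would be indifferent, assuming the gap between the two Q-values changes sign exactly once on the search interval.

```python
from functools import lru_cache
from math import log


def utility(x, gamma):
    """Normalized CRRA utility (an assumed functional form); x must be positive.

    gamma = 0 is risk neutral; gamma = 1 reduces to log utility.
    """
    if abs(gamma - 1.0) < 1e-12:
        return log(x)
    return (x ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


def banker_offer(prizes, fraction):
    """Hypothetical banker rule: a fixed fraction of the mean remaining prize."""
    return fraction * sum(prizes) / len(prizes)


def q_values(prizes, gamma, fraction=0.8):
    """Return (Q_deal, Q_no_deal), in utility units, for the remaining prizes."""

    @lru_cache(maxsize=None)
    def value(state):
        # One prize left: the contestant simply receives it.
        if len(state) == 1:
            return utility(state[0], gamma)
        return max(q(state))

    def q(state):
        # Deal: take the banker's offer now.
        q_deal = utility(banker_offer(state, fraction), gamma)
        # No deal: one remaining prize is eliminated uniformly at random,
        # then the contestant acts optimally from the resulting state.
        q_no_deal = sum(
            value(state[:i] + state[i + 1:]) for i in range(len(state))
        ) / len(state)
        return q_deal, q_no_deal

    return q(tuple(sorted(prizes)))


def indifference_gamma(prizes, fraction=0.8, lo=0.0, hi=1.0, tol=1e-8):
    """Bisect for the gamma at which Q_deal equals Q_no_deal.

    Assumes the sign of (Q_no_deal - Q_deal) changes exactly once on [lo, hi].
    """

    def gap(gamma):
        q_deal, q_no_deal = q_values(prizes, gamma, fraction)
        return q_no_deal - q_deal

    if gap(lo) <= 0 or gap(hi) >= 0:
        raise ValueError("no sign change on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under these assumptions, a contestant who rejects an offer reveals a risk aversion level below the indifference value for that state, and one who accepts reveals a level above it. Collecting these inequalities across rounds gives bounds of the kind the abstract describes.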

Taxonomy

Topics: Decision-Making and Behavioral Economics · Experimental Behavioral Economics Studies · Sports Analytics and Performance