The Gambler's Problem and Beyond

Baoxiang Wang; Shuai Li; Jiajin Li; Siu On Chan

arXiv:2001.00102·stat.ML·July 14, 2020

Open Access

TL;DR

This paper provides an exact, detailed analysis of the optimal value function in the Gambler's problem, revealing its fractal, non-smooth nature and offering insights for reinforcement learning algorithms.

Contribution

We derive the exact formula for the optimal value function in the Gambler's problem, uncovering its complex fractal structure and properties.

Findings

01. The value function is fractal and self-similar.

02. It exhibits non-smooth points with derivatives of zero or infinity.

03. The function is a generalized Cantor function with complex properties.

Abstract

We analyze the Gambler's problem, a simple reinforcement learning problem in which the gambler may double or lose each bet until the target is reached. It is an early example in the reinforcement learning textbook by Sutton and Barto (2018), who mention an interesting pattern in the optimal value function, with high-frequency components and repeating non-smooth points, but do not investigate it further. We provide the exact formula for the optimal value function in both the discrete and the continuous cases. Simple as the problem may seem, the value function is pathological: it is fractal and self-similar, its derivative takes either zero or infinity, and it cannot be written in terms of elementary functions. It is in fact one of the generalized Cantor functions, with a complexity that has so far been uncharted. Our analyses could provide insights into improving…
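
For readers who want to see the pattern the abstract refers to, the discrete problem can be explored numerically. The sketch below is plain value iteration, not the paper's exact formula. It assumes the usual textbook setup: capital in whole units, a stake between 1 and min(s, goal - s), a win probability `p_win`, and a reward of 1 only on reaching the goal. The function name, the defaults `goal=100` and `p_win=0.4`, and the stopping tolerance are illustrative choices, not taken from this page.

```python
def gambler_value_iteration(goal=100, p_win=0.4, tol=1e-10, max_sweeps=1000):
    """Approximate the optimal value function of the discrete Gambler's problem.

    v[s] estimates the probability of reaching `goal` from capital s under
    optimal play. All parameter defaults are illustrative assumptions.
    """
    v = [0.0] * (goal + 1)
    v[goal] = 1.0  # reaching the target counts as success
    for _ in range(max_sweeps):
        delta = 0.0
        for s in range(1, goal):
            # A stake a wins with probability p_win (capital s + a)
            # and loses otherwise (capital s - a).
            best = max(
                p_win * v[s + a] + (1.0 - p_win) * v[s - a]
                for a in range(1, min(s, goal - s) + 1)
            )
            delta = max(delta, abs(best - v[s]))
            v[s] = best  # in-place update
        if delta < tol:
            break
    return v
```

With the assumed defaults, `gambler_value_iteration()[50]` comes out as 0.4, since staking everything from capital 50 wins with probability `p_win`. Plotting the returned list against capital shows the jagged, repeating shape that the paper characterizes exactly.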


Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Artificial Intelligence in Games