Markov decision processes: on the convergence of the Monte-Carlo first visit algorithm

Sylvain Delattre; Nicolas Fournier

arXiv:2501.08800·math.PR·September 23, 2025

TL;DR

This paper studies the Monte-Carlo first visit algorithm, which aims to find the optimal control in a Markov decision process with a finite state space and finitely many actions, and proves that it converges when the discount factor is smaller than 1/2.

Contribution

It proves that the Monte-Carlo first visit algorithm converges when the discount factor is smaller than 1/2, for Markov decision processes with finite state and action spaces.

Findings

01

Convergence is guaranteed when the discount factor is less than 1/2.

02

The result is stated for Markov decision processes with a finite state space and a finite number of possible actions.

03

The condition on the discount factor is a sufficient one: the abstract makes no claim about discount factors of 1/2 or larger.

Abstract

We consider the Monte-Carlo first visit algorithm, of which the goal is to find the optimal control in a Markov decision process with finite state space and finite number of possible actions. We show its convergence when the discount factor is smaller than $1/2$.
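
For orientation, here is a minimal sketch of a textbook first-visit Monte Carlo control loop with exploring starts. It is not necessarily the exact variant analyzed in the paper, which the abstract does not spell out. The function names, the `step` interface, the toy two-state process, and the truncation of episodes at a fixed `horizon` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def first_visit_mc_control(states, actions, step, gamma, episodes, horizon, seed=0):
    """Generic first-visit Monte Carlo control with exploring starts.

    `step(state, action, rng)` is a hypothetical simulator interface that
    returns `(reward, next_state)`. Episodes are cut off after `horizon`
    steps, which is a simplification of the infinite-horizon discounted setting.
    """
    rng = random.Random(seed)
    q = defaultdict(float)        # running mean of first-visit returns
    visits = defaultdict(int)     # number of first visits per (state, action)
    policy = {s: actions[0] for s in states}

    for _ in range(episodes):
        # Exploring start: draw the initial state-action pair uniformly.
        s, a = rng.choice(states), rng.choice(actions)
        trajectory = []
        for _ in range(horizon):
            r, s_next = step(s, a, rng)
            trajectory.append((s, a, r))
            s = s_next
            a = policy[s]         # follow the current greedy policy afterwards

        # Record the first time step at which each pair occurs.
        first = {}
        for t, (s_t, a_t, _) in enumerate(trajectory):
            first.setdefault((s_t, a_t), t)

        # Backward pass: accumulate discounted returns, update on first visits only.
        g = 0.0
        for t in range(len(trajectory) - 1, -1, -1):
            s_t, a_t, r_t = trajectory[t]
            g = r_t + gamma * g
            if first[(s_t, a_t)] == t:
                visits[(s_t, a_t)] += 1
                q[(s_t, a_t)] += (g - q[(s_t, a_t)]) / visits[(s_t, a_t)]
                policy[s_t] = max(actions, key=lambda b: q[(s_t, b)])
    return policy, dict(q)


def toy_step(state, action, rng):
    """Hypothetical two-state process: action 1 pays 1, action 0 pays 0."""
    return float(action), 1 - state


policy, q = first_visit_mc_control(
    states=[0, 1], actions=[0, 1], step=toy_step,
    gamma=0.4, episodes=200, horizon=30,
)
```

The discount factor `gamma=0.4` is chosen below 1/2 only to match the regime covered by the paper's result; the sketch itself runs for any value in [0, 1).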
