Markov decision processes: on the convergence of the Monte-Carlo first visit algorithm
Sylvain Delattre, Nicolas Fournier

TL;DR
This paper analyzes the Monte-Carlo first visit algorithm for Markov decision processes with finite state and action spaces, and proves that it converges when the discount factor is smaller than 1/2.
Contribution
It proves that the Monte-Carlo first visit algorithm converges when the discount factor is smaller than 1/2.
Findings
Convergence is guaranteed when the discount factor is less than 1/2.
The result applies to Markov decision processes with a finite state space and a finite number of possible actions.
The convergence condition is established by proof, as a theoretical guarantee.
Abstract
We consider the Monte-Carlo first visit algorithm, whose goal is to find the optimal control in a Markov decision process with finite state space and finite number of possible actions. We show its convergence when the discount factor is smaller than 1/2.
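For readers unfamiliar with the algorithm, the following is a minimal sketch of Monte-Carlo first visit control in its common textbook form (exploring starts with sample-average updates). It is an illustration under stated assumptions, not the authors' exact formulation: the function name `mc_first_visit`, the finite `horizon` used to truncate episodes, and the two-state toy MDP are all hypothetical choices made here. The discount factor 0.4 is picked only because it lies below the 1/2 threshold from the abstract.

```python
import random


def mc_first_visit(P, R, gamma, episodes, horizon, seed=0):
    """Sketch of Monte-Carlo first visit control with exploring starts.

    P[s][a] is a list of next-state probabilities, R[s][a] a reward.
    Episodes are truncated after `horizon` steps (an assumption of this
    sketch; the neglected tail is of order gamma**horizon).
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    counts = [[0] * n_actions for _ in range(n_states)]
    policy = [0] * n_states

    for _ in range(episodes):
        # Exploring start: uniform random initial state-action pair.
        s, a = rng.randrange(n_states), rng.randrange(n_actions)
        trajectory = []
        for _ in range(horizon):
            trajectory.append((s, a, R[s][a]))
            s = rng.choices(range(n_states), weights=P[s][a])[0]
            a = policy[s]  # follow the current greedy policy afterwards

        # Walk backwards; the last overwrite for each pair is the return
        # following its earliest occurrence, i.e. its first visit.
        g, first_visit_return = 0.0, {}
        for s, a, r in reversed(trajectory):
            g = r + gamma * g
            first_visit_return[(s, a)] = g

        for (s, a), ret in first_visit_return.items():
            counts[s][a] += 1
            q[s][a] += (ret - q[s][a]) / counts[s][a]  # running average
            policy[s] = max(range(n_actions), key=lambda b: q[s][b])

    return q, policy


# Hypothetical deterministic toy MDP with 2 states and 2 actions.
P = [[[1.0, 0.0], [0.0, 1.0]],   # state 0: action 0 stays, action 1 moves to 1
     [[0.0, 1.0], [1.0, 0.0]]]   # state 1: action 0 stays, action 1 moves to 0
R = [[0.0, 1.0],
     [2.0, 0.0]]

q, policy = mc_first_visit(P, R, gamma=0.4, episodes=2000, horizon=50)
print(policy, q)
```

On this toy problem the optimal control is to move from state 0 to state 1 and then stay there, so the estimates can be checked by hand: the value of staying in state 1 is 2 / (1 - 0.4) = 10/3, and the value of moving from state 0 is 1 + 0.4 * 10/3 = 7/3.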
