Finite-Sample Analysis of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning
Suei-Wen Chen, Keith Ross, Pierre Youssef

TL;DR
This paper provides a finite-sample analysis of the Monte Carlo Exploring Starts algorithm in reinforcement learning, establishing sample complexity bounds for convergence to an optimal policy in stochastic shortest path problems.
Contribution
It introduces a novel finite-sample bound for a modified MCES algorithm, including a convergence rate analysis for policy iteration in stochastic shortest path settings.
Findings
Algorithm returns an optimal policy after $\tilde{O}(SAK^3 \log^3(1/\delta))$ episodes with high probability.
Provides the first finite-sample bound for MCES-style algorithms in stochastic shortest path problems.
The convergence rate depends on the number of states $S$, the number of actions $A$, the episode-length proxy $K$, and the reward bounds.
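To get a feel for how the stated bound scales, the short sketch below evaluates only the leading term $SAK^3\log^3(1/\delta)$. It drops every constant and every logarithmic factor hidden by $\tilde{O}$, so the numbers are relative growth rates, not episode counts. The function name `mces_bound_scaling` and the input values are illustrative assumptions, not taken from the paper.

```python
import math

def mces_bound_scaling(S, A, K, delta):
    """Leading term S * A * K^3 * log^3(1/delta).

    Constants and the logarithmic factors hidden by the tilde-O
    notation are dropped, so only ratios between calls are meaningful.
    """
    return S * A * K**3 * math.log(1.0 / delta) ** 3

# Hypothetical problem sizes, chosen only for illustration.
base = mces_bound_scaling(S=10, A=4, K=20, delta=0.05)

# Doubling the episode-length proxy K multiplies the term by about 8.
ratio_double_K = mces_bound_scaling(S=10, A=4, K=40, delta=0.05) / base

# Doubling the number of states S multiplies the term by about 2.
ratio_double_S = mces_bound_scaling(S=20, A=4, K=20, delta=0.05) / base

print(ratio_double_K, ratio_double_S)
```

The cubic dependence on $K$ dominates: under this leading term, longer episodes cost far more than additional states or actions.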
Abstract
Monte Carlo Exploring Starts (MCES), which aims to learn the optimal policy using only sample returns, is a simple and natural algorithm in reinforcement learning which has been shown to converge under various conditions. However, the convergence rate analysis for MCES-style algorithms in the form of sample complexity has received very little attention. In this paper we develop a finite sample bound for a modified MCES algorithm which solves the stochastic shortest path problem. To this end, we prove a novel result on the convergence rate of the policy iteration algorithm. This result implies that with probability at least $1-\delta$, the algorithm returns an optimal policy after $\tilde{O}(SAK^3\log^3(1/\delta))$ sampled episodes, where $S$ and $A$ denote the number of states and actions respectively, $K$ is a proxy for episode length, and $\tilde{O}$ hides logarithmic factors…
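For readers unfamiliar with MCES, the following is a minimal sketch of the classic textbook variant (first-visit return averaging with greedy policy improvement), not the modified algorithm analyzed in the paper. The three-state stochastic shortest path environment, its costs, the 0.8 advance probability, and the names `step` and `mces` are all hypothetical and chosen only so that every policy reaches the goal.

```python
import random

# Hypothetical toy SSP: states 0, 1, 2 and an absorbing goal state 3.
GOAL = 3
N_STATES, N_ACTIONS = 3, 2        # action 0 = "safe", action 1 = "shortcut"
SHORTCUT_COST = [5.0, 1.5, 3.0]   # illustrative costs, not from the paper

def step(state, action, rng):
    """Return (next_state, reward) for one transition."""
    if action == 1:               # shortcut: jump straight to the goal
        return GOAL, -SHORTCUT_COST[state]
    # safe: pay 1, advance with probability 0.8, otherwise stay put
    next_state = state + 1 if rng.random() < 0.8 else state
    return next_state, -1.0

def mces(n_episodes, seed=0):
    rng = random.Random(seed)
    # Q starts at 0, so untried actions look attractive until sampled.
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    N = [[0] * N_ACTIONS for _ in range(N_STATES)]
    policy = [0] * N_STATES

    for _ in range(n_episodes):
        # Exploring start: uniformly random initial state-action pair.
        state = rng.randrange(N_STATES)
        action = rng.randrange(N_ACTIONS)
        episode = []
        while state != GOAL:
            next_state, reward = step(state, action, rng)
            episode.append((state, action, reward))
            state = next_state
            if state != GOAL:
                action = policy[state]   # follow current policy afterwards

        # First-visit Monte Carlo update, walking the episode backwards.
        G = 0.0
        for t in reversed(range(len(episode))):
            s, a, r = episode[t]
            G += r                       # undiscounted return from time t
            if any((s, a) == (x[0], x[1]) for x in episode[:t]):
                continue                 # (s, a) occurred earlier: skip
            N[s][a] += 1
            Q[s][a] += (G - Q[s][a]) / N[s][a]   # running average of returns
            policy[s] = max(range(N_ACTIONS), key=lambda b: Q[s][b])
    return policy, Q

policy, Q = mces(5000, seed=0)
print(policy)
```

In this toy problem the optimal policy is "safe" in states 0 and 2 and "shortcut" in state 1, which the greedy policy should recover once every state-action pair has been sampled enough times. How many episodes that takes is exactly the question a finite-sample bound answers.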
Taxonomy
Topics: Evolutionary Algorithms and Applications · Simulation Techniques and Applications
