Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal

Alekh Agarwal; Sham Kakade; Lin F. Yang

arXiv:1906.03804·cs.LG·April 7, 2020·33 cites

Open Access

TL;DR

This paper proves that the simple plug-in approach to planning with a generative model (estimate the MDP from samples, then plan in the estimate) is minimax optimal in the non-asymptotic regime: for a fixed sample size, it attains the best worst-case policy quality achievable.

Contribution

It establishes the minimax optimality of the naive plug-in method in model-based reinforcement learning with a generative model, using a novel analysis technique.

Findings

01. Plug-in approach achieves minimax optimal policy quality.

02. Any efficient planning algorithm can be used in the empirical MDP.

03. Introduces a novel absorbing MDP construction for analysis.

Abstract

This work considers the sample and computational complexity of obtaining an $ϵ$-optimal policy in a discounted Markov Decision Process (MDP), given only access to a generative model. In this work, we study the effectiveness of the most natural plug-in approach to model-based planning: we build the maximum likelihood estimate of the transition model in the MDP from observations and then find an optimal policy in this empirical MDP. We ask arguably the most basic and unresolved question in model-based planning: is the naive "plug-in" approach, non-asymptotically, minimax optimal in the quality of the policy it finds, given a fixed sample size? Here, the non-asymptotic regime refers to when the sample size is sublinear in the model size. With access to a generative model, we resolve this question in the strongest possible sense: our main result shows that \emph{any} high accuracy…
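The plug-in procedure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy MDP, the function names (`estimate_model`, `value_iteration`, `evaluate_policy`, `generative_model`), the choice of value iteration as the planner, and the simplification that rewards are known exactly. The sketch draws a fixed number of next-state samples per state-action pair, forms the empirical transition model from the counts, plans in that empirical MDP, and then measures how suboptimal the resulting policy is in the true MDP.

```python
import numpy as np


def estimate_model(sample_next_state, n_states, n_actions, n_samples, rng):
    """Empirical (maximum likelihood) transition model from generative-model calls."""
    counts = np.zeros((n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(n_samples):
                counts[s, a, sample_next_state(s, a, rng)] += 1.0
    return counts / n_samples


def value_iteration(p, r, gamma, tol=1e-10, max_iters=10_000):
    """Plan in the MDP (p, r); returns a greedy policy and the value estimate."""
    v = np.zeros(p.shape[0])
    for _ in range(max_iters):
        # p has shape (S, A, S), so p @ v has shape (S, A).
        v_new = (r + gamma * (p @ v)).max(axis=1)
        done = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if done:
            break
    policy = (r + gamma * (p @ v)).argmax(axis=1)
    return policy, v


def evaluate_policy(p, r, gamma, policy):
    """Exact value of a deterministic policy in the MDP (p, r)."""
    idx = np.arange(p.shape[0])
    p_pi, r_pi = p[idx, policy], r[idx, policy]
    return np.linalg.solve(np.eye(p.shape[0]) - gamma * p_pi, r_pi)


# Hypothetical toy problem: 4 states, 2 actions, random dynamics and rewards.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 4, 2, 0.9
p_true = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
r = rng.uniform(size=(n_states, n_actions))


def generative_model(s, a, rng):
    """One call to the generative model: sample a next state for (s, a)."""
    return rng.choice(n_states, p=p_true[s, a])


# Step 1: build the empirical MDP from 200 samples per state-action pair.
p_hat = estimate_model(generative_model, n_states, n_actions, 200, rng)
# Step 2: plan in the empirical MDP (any planner could be substituted here).
pi_hat, _ = value_iteration(p_hat, r, gamma)
# Compare against the optimal value in the true MDP.
_, v_star = value_iteration(p_true, r, gamma)
gap = np.max(v_star - evaluate_policy(p_true, r, gamma, pi_hat))
print("worst-state suboptimality of the plug-in policy:", gap)
```

Value iteration is used only because it is short to write; the planner is treated as a replaceable component, and the quantity `gap` is the policy suboptimality that the sample-complexity question is about.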

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Formal Methods in Verification