Reinforcement Learning in the Wild with Maximum Likelihood-based Model Transfer
Hannes Eriksson, Debabrota Basu, Tommy Tram, Mina Alibeigi, Christos Dimitrakakis

TL;DR
This paper introduces a novel transfer learning approach for reinforcement learning called MLEMTRL, which efficiently transfers models to new MDPs using maximum likelihood estimation and planning, with proven regret bounds and empirical success.
Contribution
It proposes a two-stage algorithm for model transfer in RL that combines constrained maximum likelihood estimation with model-based planning, applicable to discrete and continuous MDPs.
Findings
Learns faster in new MDPs than methods that learn from scratch
Achieves near-optimal performance, depending on how similar the available models are to the target MDP
Provides theoretical regret bounds for the proposed method
Abstract
In this paper, we study the problem of transferring the available Markov Decision Process (MDP) models to learn and plan efficiently in an unknown but similar MDP. We refer to it as the Model Transfer Reinforcement Learning (MTRL) problem. First, we formulate MTRL for discrete MDPs and Linear Quadratic Regulators (LQRs) with continuous states and actions. Then, we propose a generic two-stage algorithm, MLEMTRL, to address the MTRL problem in discrete and continuous settings. In the first stage, MLEMTRL uses a constrained Maximum Likelihood Estimation (MLE)-based approach to estimate the target MDP model using a set of known MDP models. In the second stage, using the estimated target MDP model, MLEMTRL deploys a model-based planning algorithm appropriate for the MDP class. Theoretically, we prove worst-case regret bounds for MLEMTRL both in realisable and non-realisable…
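The two stages described in the abstract can be illustrated with a minimal sketch for a small discrete MDP. This is not the authors' implementation. It assumes that the constraint set is the set of convex combinations of the known source models, that the reward function is known, and that expectation-maximisation (EM) is an acceptable stand-in for the constrained MLE solver. The function names `fit_mixture_weights` and `value_iteration` and the toy models are hypothetical.

```python
import numpy as np

def fit_mixture_weights(source_models, transitions, n_iters=200):
    """Stage 1 (sketch): MLE of simplex weights w such that
    sum_i w_i * P_i best explains the observed target transitions.
    Uses EM for mixture weights with fixed components (an assumption)."""
    s, a, s_next = (np.asarray(x) for x in zip(*transitions))
    # lik[i, n] = probability of the n-th transition under source model i
    lik = np.stack([P[s, a, s_next] for P in source_models])
    w = np.full(len(source_models), 1.0 / len(source_models))
    for _ in range(n_iters):
        resp = w[:, None] * lik                       # unnormalised responsibilities
        resp /= resp.sum(axis=0, keepdims=True) + 1e-12
        w = resp.mean(axis=1)                         # EM update of the weights
    return w / w.sum()

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Stage 2 (sketch): plan in the estimated model.
    P has shape (S, A, S); R has shape (S, A)."""
    V = np.zeros(P.shape[0])
    while True:
        Q = R + gamma * (P @ V)                       # shape (S, A)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1), V_new
        V = V_new

# Two hypothetical source models over 2 states and 2 actions.
# P0: action 0 stays, action 1 switches state. P1: the reverse.
P0 = np.array([[[1, 0], [0, 1]], [[0, 1], [1, 0]]], dtype=float)
P1 = np.array([[[0, 1], [1, 0]], [[1, 0], [0, 1]]], dtype=float)

# Hypothetical target data: from state 0 under action 0,
# 80 transitions stay in state 0 and 20 move to state 1.
transitions = [(0, 0, 0)] * 80 + [(0, 0, 1)] * 20

w = fit_mixture_weights([P0, P1], transitions)
P_hat = np.tensordot(w, np.stack([P0, P1]), axes=1)   # estimated target model
R = np.array([[0.0, 0.0], [1.0, 1.0]])                # reward 1 while in state 1
policy, V = value_iteration(P_hat, R)

print(np.round(w, 3))  # approximately [0.8, 0.2]
print(policy)          # [1 0]: switch out of state 0, stay in state 1
```

With fixed mixture components the log-likelihood is concave in the weights, so EM is a simple choice here; any solver that respects the simplex constraint could be substituted. For the LQR setting mentioned in the abstract, the planner would be replaced by one suited to that model class.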
Taxonomy
Topics: Reinforcement Learning in Robotics
