Reinforcement Learning in the Wild with Maximum Likelihood-based Model Transfer

Hannes Eriksson; Debabrota Basu; Tommy Tram; Mina Alibeigi; Christos Dimitrakakis

arXiv:2302.09273 · cs.LG · February 21, 2023 · 1 citation

TL;DR

This paper introduces MLEMTRL, a transfer learning approach for reinforcement learning that estimates a new MDP from a set of known MDP models by constrained maximum likelihood estimation and then plans on the estimate. It comes with worst-case regret bounds and supporting empirical results.

Contribution

It proposes a two-stage algorithm for model transfer in RL that combines constrained maximum likelihood estimation with model-based planning, applicable to discrete and continuous MDPs.

Findings

1. Learns faster in new MDPs than learning from scratch.

2. Achieves near-optimal performance, depending on how similar the available models are to the target MDP.

3. Comes with theoretical regret bounds for the proposed method.

Abstract

In this paper, we study the problem of transferring the available Markov Decision Process (MDP) models to learn and plan efficiently in an unknown but similar MDP. We refer to it as the Model Transfer Reinforcement Learning (MTRL) problem. First, we formulate MTRL for discrete MDPs and Linear Quadratic Regulators (LQRs) with continuous states and actions. Then, we propose a generic two-stage algorithm, MLEMTRL, to address the MTRL problem in discrete and continuous settings. In the first stage, MLEMTRL uses a constrained Maximum Likelihood Estimation (MLE)-based approach to estimate the target MDP model using a set of known MDP models. In the second stage, using the estimated target MDP model, MLEMTRL deploys a model-based planning algorithm appropriate for the MDP class. Theoretically, we prove worst-case regret bounds for MLEMTRL both in realisable and non-realisable…
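The two stages described in the abstract can be illustrated for the discrete case with a minimal sketch. It is not the authors' implementation. It assumes that the constraint restricts the estimate to a convex combination of the known transition models (the abstract does not specify the constraint set), that rewards are known, and it uses a standard EM fixed-point update for the mixture weights and value iteration as the planner. All names (`mle_mixture_weights`, `value_iteration`, `source_models`) are hypothetical.

```python
import numpy as np

def mle_mixture_weights(source_models, counts, iters=500, eps=1e-12):
    """Stage 1 (assumed form): weights w on the simplex that maximise
    sum counts * log(sum_i w_i * P_i), found with the EM update for
    mixture proportions. source_models: (m, S, A, S); counts: (S, A, S)."""
    m = source_models.shape[0]
    w = np.full(m, 1.0 / m)
    total = counts.sum()
    for _ in range(iters):
        mix = np.tensordot(w, source_models, axes=1)            # (S, A, S)
        # Share of the observed transitions attributed to each source model.
        resp = source_models * (counts / np.maximum(mix, eps))  # (m, S, A, S)
        w = w * resp.sum(axis=(1, 2, 3)) / total
        w = w / w.sum()  # guard against floating-point drift
    return w

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Stage 2 (stand-in planner): greedy policy for transitions P (S, A, S)
    and rewards R (S, A)."""
    V = np.zeros(P.shape[0])
    while True:
        Q = R + gamma * (P @ V)      # (S, A)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1), V_new
        V = V_new

# Toy problem: two known models; the target is a 0.7 / 0.3 mixture of them.
rng = np.random.default_rng(0)
S, A = 4, 2
source_models = rng.dirichlet(np.ones(S), size=(2, S, A))       # (2, S, A, S)
target = 0.7 * source_models[0] + 0.3 * source_models[1]
R = rng.random((S, A))                                          # assumed known rewards

# Simulated target data: 5000 sampled next states per state-action pair.
counts = np.array([[rng.multinomial(5000, target[s, a]) for a in range(A)]
                   for s in range(S)])

w_hat = mle_mixture_weights(source_models, counts)
P_hat = np.tensordot(w_hat, source_models, axes=1)
policy, V = value_iteration(P_hat, R)
print("estimated weights:", np.round(w_hat, 3))
print("greedy policy:", policy)
```

In this toy setup the target lies inside the convex hull of the source models, which corresponds to the realisable case; when it does not, the same procedure returns the closest mixture in the likelihood sense, and the planner's performance then depends on how far that mixture is from the true target.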

Taxonomy

Topics: Reinforcement Learning in Robotics