Meta Learning MDPs with Linear Transition Models

Robert Müller; Aldo Pacchiano

arXiv:2201.08732·cs.LG·January 24, 2022


TL;DR

This paper introduces a meta-learning approach for MDPs with linear transition models, leveraging shared structure to improve transfer learning efficiency across tasks.

Contribution

It proposes BUC-MatrixRL, a variant of UC-MatrixRL that uses sampled training tasks to estimate the bias parameter of the task distribution, and so adapts quickly to a new task drawn from the same distribution.

Findings

01. Compared with learning tasks in isolation, BUC-MatrixRL reduces transfer regret for high-bias, low-variance task distributions.

02. The analysis extends learning-to-learn results for linear regression and linear bandits to MDPs with linear transition models.

03. Compared with learning each task independently, the method is more sample-efficient.

Abstract

We study meta-learning in Markov Decision Processes (MDPs) with linear transition models in the undiscounted episodic setting. Under a task sharedness metric based on model proximity, we study task families characterized by a distribution over models specified by a bias term and a variance component. We then propose BUC-MatrixRL, a version of the UC-MatrixRL algorithm, and show it can meaningfully leverage a set of sampled training tasks to quickly solve a test task sampled from the same task distribution by learning an estimator of the bias parameter of the task distribution. The analysis leverages and extends results in the learning-to-learn linear regression and linear bandit settings to the more general case of MDPs with linear transition models. We prove that, compared to learning the tasks in isolation, BUC-MatrixRL provides significant improvements in the transfer regret for high…
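The idea of learning a bias estimator from training tasks can be illustrated in the simpler linear regression setting that the abstract says the analysis builds on. The sketch below is not BUC-MatrixRL and is not taken from the paper: it assumes each task is a parameter vector drawn around a shared bias with small variance, estimates the bias as a plain average of per-task ridge estimates, and then regularizes a data-poor test task toward that estimate. All function names, the averaging estimator, and the numerical settings are hypothetical choices made for illustration.

```python
import numpy as np


def biased_ridge(X, y, bias, lam):
    """Closed-form minimizer of ||X theta - y||^2 + lam * ||theta - bias||^2."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * bias
    return np.linalg.solve(A, b)


def estimate_bias(task_estimates):
    """Hypothetical bias estimator: the average of per-task estimates."""
    return np.mean(np.stack(task_estimates), axis=0)


rng = np.random.default_rng(0)
d = 5                 # parameter dimension (assumed)
n_train_tasks = 50    # number of training tasks (assumed)
n_per_task = 200      # samples per training task (assumed)
n_test = 3            # samples in each test task, fewer than d on purpose
noise_std = 0.1

# High-bias, low-variance task distribution (assumed Gaussian here).
true_bias = 5.0 * rng.normal(size=d)
task_std = 0.1


def sample_task():
    return true_bias + task_std * rng.normal(size=d)


def sample_data(theta, n):
    X = rng.normal(size=(n, d))
    y = X @ theta + noise_std * rng.normal(size=n)
    return X, y


# Training phase: solve each task with ordinary ridge, then average.
train_estimates = []
for _ in range(n_train_tasks):
    theta = sample_task()
    X, y = sample_data(theta, n_per_task)
    train_estimates.append(biased_ridge(X, y, np.zeros(d), lam=1.0))
bias_hat = estimate_bias(train_estimates)

# Test phase: compare regularizing toward zero with regularizing toward bias_hat.
errs_zero, errs_meta = [], []
for _ in range(20):
    theta_test = sample_task()
    X_test, y_test = sample_data(theta_test, n_test)
    theta_zero = biased_ridge(X_test, y_test, np.zeros(d), lam=1.0)
    theta_meta = biased_ridge(X_test, y_test, bias_hat, lam=1.0)
    errs_zero.append(np.linalg.norm(theta_zero - theta_test))
    errs_meta.append(np.linalg.norm(theta_meta - theta_test))

mean_err_zero = float(np.mean(errs_zero))
mean_err_meta = float(np.mean(errs_meta))
print("mean error, regularized toward zero:    ", mean_err_zero)
print("mean error, regularized toward bias_hat:", mean_err_meta)
```

With only three samples per test task, the data cannot pin down all five coordinates, so the regularization target decides the remaining directions. When tasks cluster tightly around a large bias, pulling toward the estimated bias is expected to help, which mirrors the high-bias, low-variance regime highlighted in the findings.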


Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning

Methods: Linear Regression