Predictable MDP Abstraction for Unsupervised Model-Based RL

Seohong Park; Sergey Levine

arXiv:2302.03921 · cs.LG · June 6, 2023 · 1 citation

TL;DR

This paper introduces Predictable MDP Abstraction (PMA), an unsupervised method that transforms complex MDPs into simpler, predictable forms to improve model-based RL performance without additional environment interactions.

Contribution

The paper proposes a novel unsupervised approach to learn an action space transformation that simplifies MDP prediction tasks, enabling more accurate models and zero-shot downstream control.

Findings

01. PMA significantly improves model accuracy over prior methods.
02. PMA enables zero-shot control on benchmark tasks.
03. The approach is theoretically sound and empirically validated.

Abstract

A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions. Errors in this predictive model can degrade the performance of model-based controllers, and complex Markov decision processes (MDPs) can present exceptionally difficult prediction problems. To mitigate this issue, we propose predictable MDP abstraction (PMA): instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space that only permits predictable, easy-to-model actions, while covering the original state-action space as much as possible. As a result, model learning becomes easier and more accurate, which allows robust, stable model-based planning or model-based RL. This transformation is learned in an unsupervised manner, before any task is specified by the user. Downstream tasks can then be solved with…
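The core idea in the abstract, that a model is easier to fit when the action space only permits predictable actions, can be illustrated with a toy example. The sketch below is not the authors' method: the 1-D dynamics, the hand-written `DECODER` (PMA learns its transformation without supervision), and the tabular mean model are all assumptions chosen to keep the example small. It fits the same simple model on an original action set, where large actions have noisy outcomes, and on a restricted latent action set that maps only to the noise-free actions.

```python
import random

def step(state, action, rng):
    """Toy 1-D dynamics (assumed): large-magnitude actions are noisy, small ones are exact."""
    noise_scale = 2.0 if abs(action) >= 2 else 0.0
    return state + action + rng.gauss(0.0, noise_scale)

ORIGINAL_ACTIONS = [-2, -1, 0, 1, 2]
# Hypothetical hand-written decoder from latent action z to an original action.
# PMA would learn this mapping; it is fixed here only for illustration.
DECODER = {0: -1, 1: 0, 2: 1}

def fit_mean_model(transitions):
    """Tabular model: predicted displacement is the mean observed displacement per action."""
    sums, counts = {}, {}
    for s, a, s_next in transitions:
        sums[a] = sums.get(a, 0.0) + (s_next - s)
        counts[a] = counts.get(a, 0) + 1
    return {a: sums[a] / counts[a] for a in sums}

def mean_squared_error(model, transitions):
    """Average squared one-step prediction error of the tabular model."""
    errors = [((s + model[a]) - s_next) ** 2 for s, a, s_next in transitions]
    return sum(errors) / len(errors)

def collect(action_keys, to_env_action, n, rng):
    """Sample n transitions, recording the action key the model will condition on."""
    data = []
    for _ in range(n):
        s = rng.uniform(-5.0, 5.0)
        k = rng.choice(action_keys)
        data.append((s, k, step(s, to_env_action(k), rng)))
    return data

rng = random.Random(0)
orig_train = collect(ORIGINAL_ACTIONS, lambda a: a, 2000, rng)
orig_test = collect(ORIGINAL_ACTIONS, lambda a: a, 2000, rng)
lat_train = collect(list(DECODER), DECODER.get, 2000, rng)
lat_test = collect(list(DECODER), DECODER.get, 2000, rng)

mse_original = mean_squared_error(fit_mean_model(orig_train), orig_test)
mse_latent = mean_squared_error(fit_mean_model(lat_train), lat_test)
print(f"original MDP model error:    {mse_original:.3f}")
print(f"transformed MDP model error: {mse_latent:.3f}")
```

In this toy setting the restricted action space gives a near-zero model error, at the cost of no longer being able to move two units in a single step. That is the trade-off the abstract describes: keep actions predictable while covering as much of the original state-action space as possible.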

Code & Models

Videos

Predictable MDP Abstraction for Unsupervised Model-Based RL · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning