Model Generation with Provable Coverability for Offline Reinforcement Learning

Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, and Yang Yu

arXiv:2206.00316·cs.LG·June 9, 2022

Open Access

TL;DR

This paper introduces a model generation method for offline reinforcement learning with provable coverability: the generated models better approximate the real dynamics, which improves out-of-distribution generalization and transfer performance.

Contribution

We propose a novel algorithm for generating models with guaranteed coverage of real dynamics, backed by theoretical analysis and improved empirical performance.

Findings

01. Outperforms prior offline RL methods on benchmarks

02. Models exhibit better zero-shot transfer performance

03. Provides theoretical guarantees on model coverability

Abstract

Model-based offline optimization with a dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy can adapt to the different dynamics enumerated at the training stage. However, due to the limitations of the offline setting, the learned model cannot mimic the real dynamics well enough to support reliable out-of-distribution exploration, which still hinders the policy from generalizing well. To narrow the gap, previous works roughly ensemble randomly initialized models to better approximate the real dynamics. However, such a practice is costly and inefficient, and provides no guarantee on how well the real dynamics can be approximated by the learned models, a property we name coverability in this paper. We actively address this issue by generating models with a provable ability to cover real dynamics in an efficient and controllable…

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification