Prototypical context-aware dynamics generalization for high-dimensional model-based reinforcement learning

Junjie Wang; Yao Mu; Dong Li; Qichao Zhang; Dongbin Zhao; Yuzheng Zhuang; Ping Luo; Bin Wang; Jianye Hao

arXiv:2211.12774·cs.LG·November 24, 2022

TL;DR

This paper introduces ProtoCAD, a novel model that enhances high-dimensional model-based reinforcement learning by capturing environment context through prototypes, significantly improving dynamics generalization across diverse tasks.

Contribution

ProtoCAD is the first to incorporate temporally consistent prototypes and combined context representations for better dynamics generalization in high-dimensional control tasks.

Findings

1. ProtoCAD outperforms existing methods in dynamics generalization.

2. ProtoCAD achieves 13.2% better mean performance and 26.7% better median performance than RSSM.

3. Extensive experiments validate its superior generalization ability.

Abstract

The latent world model provides a promising way to learn policies in a compact latent space for tasks with high-dimensional observations; however, its generalization across diverse environments with unseen dynamics remains challenging. Although the recurrent structure used in recent work helps to capture local dynamics, modeling only state transitions without an explicit understanding of environmental context limits the generalization ability of the dynamics model. To address this issue, we propose a Prototypical Context-Aware Dynamics (ProtoCAD) model, which captures local dynamics through a time-consistent latent context and enables dynamics generalization in high-dimensional control tasks. ProtoCAD extracts useful contextual information with the help of prototypes clustered over the batch and benefits model-based RL in two ways: 1) It utilizes a temporally consistent…
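To make the idea of "prototypes clustered over the batch" more concrete, here is a minimal sketch of one common way to relate latent context vectors to a set of prototypes: a softmax over cosine similarities, followed by a weighted mix of the prototypes. This illustrates the general technique only and is not the authors' implementation. The function names (`soft_assign`, `mix_prototypes`), the temperature value, the dimensions, and the random data are all assumptions made for the example.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def soft_assign(context, prototypes, temperature=0.1):
    """Return softmax weights of one context vector over all prototypes.

    Hypothetical helper: the temperature and the cosine similarity are assumptions.
    """
    c = normalize(context)
    logits = [
        sum(a * b for a, b in zip(c, normalize(p))) / temperature
        for p in prototypes
    ]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_prototypes(weights, prototypes):
    """Weighted sum of prototypes, giving a prototype-based context feature."""
    dim = len(prototypes[0])
    return [sum(w * p[i] for w, p in zip(weights, prototypes)) for i in range(dim)]


# Illustrative data only: 4 prototypes and a batch of 8 context vectors, 16-dim each.
random.seed(0)
prototypes = [[random.gauss(0, 1) for _ in range(16)] for _ in range(4)]
batch = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]

assignments = [soft_assign(c, prototypes) for c in batch]
context_features = [mix_prototypes(w, prototypes) for w in assignments]
```

Each row of `assignments` sums to one, so it can be read as a soft cluster membership for that context vector. A lower temperature pushes the weights toward a single prototype, while a higher one spreads them more evenly.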

Taxonomy

Topics: Reinforcement Learning in Robotics