Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations

Renzhe Zhou; Chen-Xiao Gao; Zongzhang Zhang; Yang Yu

arXiv:2312.15909 · cs.LG · December 27, 2023 · 1 citation

TL;DR

This paper introduces GENTLE, a novel offline meta-reinforcement learning method that learns generalizable task representations from limited data by using a task auto-encoder optimized through state transition and reward reconstruction, improving performance on diverse tasks.
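
To make the reconstruction objective concrete, here is a minimal sketch of a task auto-encoder loss. It is an illustration under stated assumptions, not the authors' implementation: the linear encoder and decoder, the mean-pooling over context transitions, the squared-error loss, and all names (`encode`, `decode`, `reconstruction_loss`, the dimensions `S`, `A`, `Z`) are hypothetical choices made only to show the shape of the computation. The encoder turns a set of context transitions into one task vector, and the decoder must predict the next state and reward from a state, an action, and that vector.

```python
import random

def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def init(rows, cols, rng):
    """Small random weights; a real model would be trained, not just sampled."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def encode(enc_w, context):
    """Embed each (s, a, r, s') transition, then mean-pool into a task vector z.
    Mean-pooling is an assumption made for this sketch."""
    embeds = [linear(enc_w, s + a + [r] + s2) for s, a, r, s2 in context]
    return [sum(col) / len(embeds) for col in zip(*embeds)]

def decode(dec_w, s, a, z):
    """Predict (next_state, reward) from state, action, and task vector."""
    out = linear(dec_w, s + a + z)
    return out[:-1], out[-1]

def reconstruction_loss(enc_w, dec_w, context, batch):
    """Mean squared error on next-state and reward reconstruction."""
    z = encode(enc_w, context)
    total = 0.0
    for s, a, r, s2 in batch:
        s2_hat, r_hat = decode(dec_w, s, a, z)
        total += sum((p - t) ** 2 for p, t in zip(s2_hat, s2)) + (r_hat - r) ** 2
    return total / len(batch)

# Toy data: random transitions stand in for an offline dataset of one task.
rng = random.Random(0)
S, A, Z = 3, 2, 4  # hypothetical state, action, and task-vector sizes

def sample_transition():
    s = [rng.random() for _ in range(S)]
    a = [rng.random() for _ in range(A)]
    s2 = [rng.random() for _ in range(S)]
    return s, a, rng.random(), s2

context = [sample_transition() for _ in range(8)]
batch = [sample_transition() for _ in range(5)]
enc_w = init(Z, S + A + 1 + S, rng)   # encoder input is (s, a, r, s')
dec_w = init(S + 1, S + A + Z, rng)   # decoder output is (s', r)
loss = reconstruction_loss(enc_w, dec_w, context, batch)
```

In a trained system the encoder and decoder weights would be updated to minimize `loss`, so the task vector is forced to carry whatever information about dynamics and rewards the decoder needs.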

Contribution

GENTLE learns task representations under two data limitations, limited training tasks and limited behavior diversity, by combining a reconstruction-based Task Auto-Encoder with pseudo-transitions, with the aim of improving generalization in offline meta-RL.
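
The summary does not say how pseudo-transitions are built. One reading, assumed here and not confirmed by the text above, is that state-action pairs pooled from every task's dataset are relabeled with a per-task model, so each task sees transitions for behavior it never collected itself. The sketch below illustrates that idea on a toy 1-D problem; `make_task_model`, `make_pseudo_transitions`, the task names, and the goal values are all hypothetical, and the hand-written model stands in for what would be a learned dynamics and reward model.

```python
def make_task_model(goal):
    """Stand-in for a learned per-task model on a 1-D point environment.
    Returns (reward, next_state); reward is negative distance to the goal."""
    def model(s, a):
        s2 = s + a
        return -abs(s2 - goal), s2
    return model

def make_pseudo_transitions(state_actions, task_model):
    """Relabel (s, a) pairs with one task's model to get (s, a, r, s') tuples."""
    return [(s, a, *task_model(s, a)) for s, a in state_actions]

# Each task's offline data covers only a narrow slice of behavior.
datasets = {
    "task_a": [(0.0, 1.0), (1.0, 1.0)],     # only moves right
    "task_b": [(0.0, -1.0), (-1.0, -1.0)],  # only moves left
}
models = {
    "task_a": make_task_model(2.0),
    "task_b": make_task_model(-2.0),
}

# Pool state-action pairs across tasks, then relabel them for every task.
pooled = [sa for pairs in datasets.values() for sa in pairs]
pseudo = {name: make_pseudo_transitions(pooled, m) for name, m in models.items()}
```

After relabeling, `pseudo["task_b"]` contains right-moving transitions that the original `task_b` data lacked, which is the kind of added behavior diversity the contribution statement alludes to.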

Findings

01. GENTLE outperforms existing methods on in-distribution tasks.

02. GENTLE achieves superior results on out-of-distribution tasks.

03. GENTLE maintains robustness across different testing protocols.

Abstract

Generalization and sample efficiency have been long-standing issues concerning reinforcement learning, and thus the field of Offline Meta-Reinforcement Learning (OMRL) has gained increasing attention due to its potential of solving a wide range of problems with static and limited offline data. Existing OMRL methods often assume sufficient training tasks and data coverage to apply contrastive learning to extract task representations. However, such assumptions are not applicable in several real-world applications and thus undermine the generalization ability of the representations. In this paper, we consider OMRL with two types of data limitations, limited training tasks and limited behavior diversity, and propose a novel algorithm called GENTLE for learning generalizable task representations in the face of data limitations. GENTLE employs Task Auto-Encoder (TAE), which is an…

Code & Models

Repositories

lamda-rl/gentle (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Contrastive Learning · ALIGN