Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations
Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang, Yang Yu

TL;DR
This paper introduces GENTLE, a novel offline meta-reinforcement learning method that learns generalizable task representations from limited data by using a task auto-encoder optimized through state transition and reward reconstruction, improving performance on diverse tasks.
Contribution
GENTLE is the first approach to effectively learn task representations under data limitations by employing a reconstruction-based auto-encoder and pseudo-transitions, enhancing generalization in offline meta-RL.
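To make the reconstruction objective concrete, here is a minimal sketch of a task auto-encoder trained to reconstruct state transitions and rewards. It is not the authors' implementation. The linear layers, the mean pooling over context transitions, the dimensions, and every name (`encode_task`, `decode`, `reconstruction_loss`) are assumptions made for illustration. Pseudo-transitions are not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this sketch.
STATE_DIM, ACTION_DIM, LATENT_DIM = 3, 2, 4
IN_DIM = STATE_DIM + ACTION_DIM + 1 + STATE_DIM  # one transition: (s, a, r, s')

# Assumed linear encoder and decoder weights (a real model would use deeper networks).
W_enc = rng.normal(scale=0.1, size=(IN_DIM, LATENT_DIM))
W_dec = rng.normal(scale=0.1, size=(STATE_DIM + ACTION_DIM + LATENT_DIM, STATE_DIM + 1))


def encode_task(context):
    """Map a context batch of transitions (N, IN_DIM) to one task embedding (LATENT_DIM,)."""
    per_transition = np.tanh(context @ W_enc)
    # Mean pooling makes the embedding independent of transition order.
    return per_transition.mean(axis=0)


def decode(states, actions, z):
    """Predict next states and rewards from (s, a) and the task embedding z."""
    z_tiled = np.broadcast_to(z, (states.shape[0], z.shape[0]))
    out = np.concatenate([states, actions, z_tiled], axis=1) @ W_dec
    return out[:, :STATE_DIM], out[:, STATE_DIM]


def reconstruction_loss(context):
    """Mean squared error on next-state and reward reconstruction."""
    s = context[:, :STATE_DIM]
    a = context[:, STATE_DIM:STATE_DIM + ACTION_DIM]
    r = context[:, STATE_DIM + ACTION_DIM]
    s_next = context[:, STATE_DIM + ACTION_DIM + 1:]
    z = encode_task(context)
    pred_s_next, pred_r = decode(s, a, z)
    return np.mean((pred_s_next - s_next) ** 2) + np.mean((pred_r - r) ** 2)


# Random stand-in for 16 transitions collected from a single task.
context = rng.normal(size=(16, IN_DIM))
z = encode_task(context)
loss = reconstruction_loss(context)
```

In this sketch the task embedding can only lower the loss by carrying information about the task's dynamics and reward function, which is the intuition behind a reconstruction-based objective. Training would minimize `reconstruction_loss` over contexts drawn from each training task.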
Findings
GENTLE outperforms existing methods on in-distribution tasks.
GENTLE achieves superior results on out-of-distribution tasks.
GENTLE maintains robustness across different testing protocols.
Abstract
Generalization and sample efficiency have been long-standing issues concerning reinforcement learning, and thus the field of Offline Meta-Reinforcement Learning (OMRL) has gained increasing attention due to its potential of solving a wide range of problems with static and limited offline data. Existing OMRL methods often assume sufficient training tasks and data coverage to apply contrastive learning to extract task representations. However, such assumptions are not applicable in several real-world applications and thus undermine the generalization ability of the representations. In this paper, we consider OMRL with two types of data limitations: limited training tasks and limited behavior diversity and propose a novel algorithm called GENTLE for learning generalizable task representations in the face of data limitations. GENTLE employs Task Auto-Encoder (TAE), which is an…
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
Methods: Contrastive Learning · ALIGN
