Entropy Regularized Task Representation Learning for Offline Meta-Reinforcement Learning

Mohammadreza Nakhaei; Aidan Scannell; Joni Pajarinen

arXiv:2412.14834·cs.LG·January 23, 2025

TL;DR

This paper introduces an entropy regularization method for offline meta-reinforcement learning that improves task representation quality and enhances generalization to new tasks by reducing overfitting to offline data.

Contribution

The paper proposes an entropy regularization approach that minimizes the mutual information between the task representations and the behavior policy, addressing the distribution mismatch between offline and test-time contexts in offline meta-RL.
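
As a rough illustration of why an entropy term can stand in for a mutual information term, the standard identity below may help. The symbols are assumptions introduced here, not notation taken from the paper: Z denotes the task representation and A the actions produced by the behavior policy. This summary does not state the exact objective the authors optimize, so treat this as background rather than as their loss.

```latex
I(Z; A) = H(A) - H(A \mid Z)
```

Because H(A) is fixed by the offline data and does not depend on the encoder that produces Z, lowering I(Z; A) with respect to the encoder corresponds to raising the conditional entropy H(A | Z).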

Findings

01. Task representations better capture underlying tasks.
02. Improved performance on in-distribution tasks.
03. Enhanced generalization to out-of-distribution tasks.

Abstract

Offline meta-reinforcement learning aims to equip agents with the ability to rapidly adapt to new tasks by training on data from a set of different tasks. Context-based approaches utilize a history of state-action-reward transitions -- referred to as the context -- to infer representations of the current task, and then condition the agent, i.e., the policy and value function, on the task representations. Intuitively, the better the task representations capture the underlying tasks, the better the agent can generalize to new tasks. Unfortunately, context-based approaches suffer from distribution mismatch, as the context in the offline data does not match the context at test time, limiting their ability to generalize to the test tasks. This leads to the task representations overfitting to the offline training data. Intuitively, the task representations should be independent of the…
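
The context-based setup described in the abstract can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation: the dimensions, the random linear weights `W_enc` and `W_pi`, the mean pooling, and the function names `encode_context` and `policy` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for this illustration.
STATE_DIM, ACTION_DIM, LATENT_DIM = 4, 2, 3
TRANSITION_DIM = STATE_DIM + ACTION_DIM + 1  # one (state, action, reward) row

# Random linear weights standing in for trained networks (assumption).
W_enc = rng.normal(size=(TRANSITION_DIM, LATENT_DIM))
W_pi = rng.normal(size=(STATE_DIM + LATENT_DIM, ACTION_DIM))


def encode_context(context):
    """Map a (T, TRANSITION_DIM) context to one task representation z."""
    per_transition = np.tanh(context @ W_enc)  # embed each transition
    return per_transition.mean(axis=0)         # order-independent pooling


def policy(state, z):
    """Deterministic policy conditioned on the state and on z."""
    return np.tanh(np.concatenate([state, z]) @ W_pi)


# A context of 16 transitions, as might be drawn from one task's data.
context = rng.normal(size=(16, TRANSITION_DIM))
z = encode_context(context)
action = policy(rng.normal(size=STATE_DIM), z)
```

In this sketch the policy only sees the task through `z`, so any bias in how the context was collected passes straight into the agent's behavior, which is the distribution mismatch the abstract points to.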

Code & Models

Repositories

mohammadrezanakhaei/er-trl
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control

Methods: Sparse Evolutionary Training