Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning

Hai Zhang; Boyuan Zheng; Tianying Ji; Jinhang Liu; Anqi Guo; Junqiao Zhao; Lanqing Li

arXiv:2405.12001 · cs.LG · February 4, 2025

TL;DR

This paper investigates the theoretical underpinnings of offline meta reinforcement learning, identifying task representation shift as a key factor affecting performance, and proposes conditions to ensure monotonic improvements.

Contribution

It introduces the concept of task representation shift, provides theoretical guarantees for performance improvements, and clarifies the relationship between context encoder updates and RL performance.

Findings

1. Links the optimization framework to RL return maximization
2. Identifies task representation shift as a factor affecting performance
3. Proves conditions for monotonic performance improvement

Abstract

Offline meta reinforcement learning (OMRL) has emerged as a promising approach for interaction avoidance and strong generalization performance by leveraging pre-collected data and meta-learning techniques. Previous context-based approaches predominantly rely on the intuition that alternating optimization between the context encoder and the policy can lead to performance improvements, as long as the context encoder follows the principle of maximizing the mutual information between the task variable $M$ and its latent representation $Z$ ($I(Z; M)$), while the policy adopts standard offline reinforcement learning (RL) algorithms conditioned on the learned task representation. Despite promising results, the theoretical justification of performance improvements for such intuition remains underexplored. Inspired by the return discrepancy scheme in the model-based RL field, we find that the…
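
For reference, the mutual information objective mentioned in the abstract has the standard information-theoretic definition below. This is the textbook form, written under the assumption that $M$ and $Z$ have a joint distribution $p(z, m)$; the estimator or bound the paper actually optimizes is not given in this excerpt.

$$
I(Z; M) = \mathbb{E}_{p(z, m)}\left[\log \frac{p(z, m)}{p(z)\,p(m)}\right] = H(M) - H(M \mid Z)
$$

Since $H(M)$ is fixed by the task distribution, maximizing $I(Z; M)$ amounts to reducing the uncertainty about the task $M$ that remains once $Z$ is known.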

Code & Models

Repositories

betray12138/task-representation-shift (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics