PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

Tong Sang; Hongyao Tang; Yi Ma; Jianye Hao; Yan Zheng; Zhaopeng Meng; Boyan Li; Zhen Wang

arXiv:2204.02877·cs.LG·May 31, 2022

TL;DR

PAnDR introduces a method for rapid policy adaptation in new environments by decoupling environment and policy representations learned offline, enabling effective online updates with minimal interaction.

Contribution

The paper proposes a novel offline training framework that learns decoupled environment and policy representations for fast online adaptation in reinforcement learning.

Findings

01. PAnDR outperforms existing algorithms in policy adaptation tasks.

02. Decoupled representations improve adaptation speed and effectiveness.

03. Mutual information optimization enhances representation quality.

Abstract

Deep Reinforcement Learning (DRL) has been a promising solution to many complex decision-making problems. Nevertheless, its notorious weakness in generalizing across environments prevents widespread application of DRL agents in real-world scenarios. Although advances have been made recently, most prior works assume sufficient online interaction in the training environments, which can be costly in practice. To this end, we focus on an offline-training-online-adaptation setting, in which the agent first learns from offline experiences collected in environments with different dynamics and then performs online policy adaptation in environments with new dynamics. In this paper, we propose Policy Adaptation with Decoupled Representations (PAnDR) for fast policy adaptation. In the offline training phase, the environment representation and policy representation are learned through contrastive…

Code & Models

Repositories

tju-drl-lab/self-supervised-rl (PyTorch)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Contrastive Learning