PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations
Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang

TL;DR
PAnDR introduces a method for rapid policy adaptation in new environments by decoupling environment and policy representations learned offline, enabling effective online updates with minimal interaction.
Contribution
The paper proposes a novel offline training framework that learns decoupled environment and policy representations for fast online adaptation in reinforcement learning.
Findings
PAnDR outperforms existing algorithms in policy adaptation tasks.
Decoupled representations improve adaptation speed and effectiveness.
Mutual information optimization enhances representation quality.
Abstract
Deep Reinforcement Learning (DRL) has been a promising solution to many complex decision-making problems. Nevertheless, its notorious weakness in generalizing across environments prevents widespread application of DRL agents in real-world scenarios. Although advances have been made recently, most prior works assume sufficient online interaction with training environments, which can be costly in practical cases. To this end, we focus on an offline-training-online-adaptation setting, in which the agent first learns from offline experiences collected in environments with different dynamics and then performs online policy adaptation in environments with new dynamics. In this paper, we propose Policy Adaptation with Decoupled Representations (PAnDR) for fast policy adaptation. In the offline training phase, the environment representation and policy representation are learned through contrastive…
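The abstract is cut off at the point where it mentions contrastive learning of representations. As a rough illustration of what a contrastive objective looks like in general, the sketch below computes a generic InfoNCE-style loss with NumPy. It is not the paper's exact objective: the function name `info_nce_loss`, the `temperature` value, and the toy one-hot embeddings are all assumptions made for this example.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not PAnDR's exact objective).

    Row i of `positives` is treated as the positive sample for row i of
    `anchors`; every other row in the batch acts as a negative.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                       # shape (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the matching index as the target class.
    return float(-np.mean(np.diag(log_probs)))

# Toy embeddings (hypothetical): four mutually orthogonal vectors.
z = np.eye(4, 8)
aligned_loss = info_nce_loss(z, z)                       # positives match anchors
shuffled_loss = info_nce_loss(z, np.roll(z, 1, axis=0))  # positives mismatched
```

In this toy setup the loss is close to zero when each anchor is paired with its own embedding and large when the pairs are shuffled, which is the behaviour a contrastive objective rewards: embeddings from the same source are pulled together and those from different sources are pushed apart.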
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Contrastive Learning
