AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning
Biwei Huang, Fan Feng, Chaochao Lu, Sara Magliacane, Kun Zhang

TL;DR
AdaRL introduces a framework for rapid and reliable transfer reinforcement learning by using graphical representations to efficiently adapt policies across domains with minimal samples, even in partially observable environments.
Contribution
The paper proposes a graphical-representation-based method for efficient policy adaptation in transfer RL, which reduces sample complexity and avoids further policy optimization in the target domain.
Findings
Effective adaptation with few samples in Cartpole and Atari environments.
Compact graphical representations encode structural changes across domains.
Significantly reduces adaptation time compared to existing methods.
Abstract
One practical challenge in reinforcement learning (RL) is how to make quick adaptations when faced with new environments. In this paper, we propose a principled framework for adaptive RL, called AdaRL, that adapts reliably and efficiently to changes across domains with a few samples from the target domain, even in partially observable environments. Specifically, we leverage a parsimonious graphical representation that characterizes structural relationships over variables in the RL system. Such graphical representations provide a compact way to encode what and where the changes across domains are, and furthermore inform us with a minimal set of changes that one has to consider for the purpose of policy adaptation. We show that by explicitly leveraging this compact representation to encode changes, we can efficiently adapt the policy to the target domain, in which only a few…
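The idea in the abstract, that only a small set of domain-specific changes needs to be estimated while everything else is reused, can be illustrated with a toy sketch. This is not the AdaRL implementation: the one-dimensional linear dynamics, the scalar change factor `theta`, and the helper names `step`, `estimate_theta`, `policy`, and `collect` are all assumptions made up for this example. It only shows the general pattern of fitting a low-dimensional change factor from a few target-domain transitions and plugging it into a policy that is already conditioned on that factor, with no further policy optimization.

```python
import random

def step(s, a, theta):
    # Hypothetical shared structure: s' = s + theta * a.
    # Only theta is assumed to differ between domains.
    return s + theta * a

def estimate_theta(transitions):
    # Least-squares fit of theta from (s, a, s_next) tuples,
    # holding the shared structure fixed.
    num = sum(a * (s_next - s) for s, a, s_next in transitions)
    den = sum(a * a for _, a, _ in transitions)
    return num / den

def policy(s, theta, goal=0.0):
    # Policy conditioned on the change factor: chooses the action
    # that would reach the goal in one step under the toy dynamics.
    return (goal - s) / theta

def collect(theta, n, rng):
    # Gather n random-action transitions from a domain with the given theta.
    transitions = []
    s = rng.uniform(-1.0, 1.0)
    for _ in range(n):
        a = rng.uniform(-1.0, 1.0)
        s_next = step(s, a, theta)
        transitions.append((s, a, s_next))
        s = s_next
    return transitions

rng = random.Random(0)
target_theta = 2.5                       # unknown to the agent
few_samples = collect(target_theta, 5, rng)
theta_hat = estimate_theta(few_samples)  # adapt: estimate only the change factor

s0 = 3.0
s1 = step(s0, policy(s0, theta_hat), target_theta)  # reuse the policy as-is
print(theta_hat, s1)
```

In this noise-free toy setting five transitions are enough to recover `theta`, so the reused policy reaches the goal in the target domain. The paper's setting is much richer (learned graphical structure, partial observability), and none of that is modeled here.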
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Software Engineering Research
