Efficient Deep Reinforcement Learning via Adaptive Policy Transfer
Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, and Jiajie Peng

TL;DR
This paper introduces a novel Policy Transfer Framework that optimizes target policies in deep reinforcement learning by adaptively selecting and terminating source policies without explicit similarity measures, leading to faster learning.
Contribution
It presents a new framework that models multi-policy transfer as an option learning problem, enabling adaptive policy reuse without explicit task similarity computation.
Findings
Significantly accelerates the RL learning process.
Outperforms state-of-the-art policy transfer methods in learning efficiency.
Achieves better final performance across various action spaces.
Abstract
Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from previously learned policies of relevant tasks. Existing transfer approaches either explicitly compute the similarity between tasks or select appropriate source policies to provide guided exploration for the target task. However, a way to directly optimize the target policy by alternately utilizing knowledge from appropriate source policies, without explicitly measuring similarity, is currently missing. In this paper, we propose a novel Policy Transfer Framework (PTF) to accelerate RL by taking advantage of this idea. Our framework learns when and which source policy is best to reuse for the target policy, and when to terminate it, by modeling multi-policy transfer as an option learning problem. PTF can be easily combined with existing deep RL approaches.…
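The "when and which source policy to reuse, and when to terminate it" idea can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `OptionSelector`, the state-independent option values, the epsilon-greedy choice, and the heuristic termination update are all assumptions made for brevity. Each source policy is treated as an option with a value estimate and a termination probability.

```python
import random


class OptionSelector:
    """Hypothetical helper: one option per source policy (illustrative only)."""

    def __init__(self, n_options, epsilon=0.1, lr=0.1, gamma=0.99, seed=0):
        self.q = [0.0] * n_options     # option values (state-independent for brevity)
        self.beta = [0.5] * n_options  # termination probability of each option
        self.epsilon, self.lr, self.gamma = epsilon, lr, gamma
        self.rng = random.Random(seed)
        self.current = None            # index of the source policy currently in use

    def select(self):
        """Keep the current option unless it terminates, then pick epsilon-greedily."""
        if self.current is not None and self.rng.random() >= self.beta[self.current]:
            return self.current
        if self.rng.random() < self.epsilon:
            self.current = self.rng.randrange(len(self.q))
        else:
            self.current = max(range(len(self.q)), key=self.q.__getitem__)
        return self.current

    def update(self, option, reward):
        """Update the option value, then adjust its termination probability."""
        # Bootstrapped target: continue with this option or switch to the best one.
        u = (1 - self.beta[option]) * self.q[option] + self.beta[option] * max(self.q)
        self.q[option] += self.lr * (reward + self.gamma * u - self.q[option])
        # Assumed heuristic: options valued below the average terminate more readily.
        advantage = self.q[option] - sum(self.q) / len(self.q)
        self.beta[option] = min(0.95, max(0.05, self.beta[option] - self.lr * advantage))


if __name__ == "__main__":
    # Toy usage: three source policies whose reuse yields fixed, made-up rewards.
    toy_rewards = [0.0, 1.0, 0.2]
    selector = OptionSelector(n_options=3)
    for _ in range(200):
        o = selector.select()
        selector.update(o, toy_rewards[o])
    print("option values:", selector.q)
    print("termination probabilities:", selector.beta)
```

No task-similarity measure appears anywhere in the sketch: the preference for a source policy emerges only from the reward observed while reusing it, which is the property the abstract emphasizes. How the chosen source policy then shapes the target policy's update is not covered here.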
Taxonomy
Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Smart Grid Energy Management
