IOB: Integrating Optimization Transfer and Behavior Transfer for Multi-Policy Reuse

Siyuan Li; Hao Li; Jin Zhang; Zhen Wang; Peng Liu; Chongjie Zhang

arXiv:2308.07351·cs.LG·August 16, 2023

TL;DR

This paper introduces IOB, a transfer reinforcement learning method that combines optimization transfer and behavior transfer. It selects source policies based on one-step improvement, which improves transfer efficiency and performance.

Contribution

The paper proposes a new transfer RL approach that selects source policies without extra training components, using the Q function for guidance, and integrates optimization and behavior transfer for better results.
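As a rough illustration of selecting a source policy with only a Q function, the sketch below picks, for a given state, the candidate policy whose proposed action scores highest under the target task's Q estimate. This is one reading of the summary above, not the paper's implementation: the names `select_guidance_policy`, `candidates`, and `q_target`, the deterministic discrete-action policies, and the choice to put the target policy in the candidate set are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types for illustration only.
State = Tuple[float, ...]
Action = int
Policy = Callable[[State], Action]
QFunction = Callable[[State, Action], float]


def select_guidance_policy(
    state: State,
    candidates: Sequence[Policy],
    q_target: QFunction,
) -> Tuple[int, Optional[Action]]:
    """Return (index, action) of the candidate whose action has the highest Q value.

    No extra component is trained here: the only learned quantity consulted
    is the Q estimate for the target task (an assumption of this sketch).
    Ties are resolved in favor of the earliest candidate.
    """
    best_index, best_action, best_value = -1, None, float("-inf")
    for index, policy in enumerate(candidates):
        action = policy(state)
        value = q_target(state, action)
        if value > best_value:
            best_index, best_action, best_value = index, action, value
    return best_index, best_action


# Toy usage: the Q estimate prefers actions close to 2.
def toy_q(state: State, action: Action) -> float:
    return -abs(action - 2)


toy_candidates = [
    lambda s: 0,  # source policy A (hypothetical)
    lambda s: 2,  # source policy B (hypothetical)
    lambda s: 3,  # current target policy (hypothetical)
]

index, action = select_guidance_policy((0.0,), toy_candidates, toy_q)
print(index, action)  # prints "1 2"
```

Under these assumptions, the selected policy could serve both roles the summary mentions: as a guide in the optimization objective and as the policy that collects data. How IOB actually combines the two is not specified in this summary.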

Findings

01. Outperforms state-of-the-art transfer RL baselines on benchmark tasks.

02. Enhances final performance and transferability in continual learning.

03. Guarantees improvement in target policy learning.

Abstract

Humans have the ability to reuse previously learned policies to solve new tasks quickly, and reinforcement learning (RL) agents can do the same by transferring knowledge from source policies to a related target task. Transfer RL methods can reshape the policy optimization objective (optimization transfer) or influence the behavior policy (behavior transfer) using source policies. However, selecting the appropriate source policy with limited samples to guide target policy learning has been a challenge. Previous methods introduce additional components, such as hierarchical policies or estimations of source policies' value functions, which can lead to non-stationary policy optimization or heavy sampling costs, diminishing transfer effectiveness. To address this challenge, we propose a novel transfer RL method that selects the source policy without training extra components. Our method…

Taxonomy

Topics: Fuel Cells and Related Materials · Reinforcement Learning in Robotics