Efficient Multi-Task Reinforcement Learning with Cross-Task Policy Guidance

Jinmin He; Kai Li; Yifan Zang; Haobo Fu; Qiang Fu; Junliang Xing; Jian Cheng

arXiv:2507.06615·cs.LG·July 10, 2025

TL;DR

This paper introduces Cross-Task Policy Guidance (CTPG), a multi-task reinforcement learning framework in which the control policies of tasks that are already proficient guide tasks that are not yet mastered, improving learning efficiency and performance.

Contribution

The paper proposes CTPG, which trains a guide policy for each task to select the behavior policy from all tasks' control policies, and adds two gating mechanisms that improve CTPG's learning efficiency.

Findings

01 CTPG improves learning speed in manipulation and locomotion tasks.

02 Incorporating CTPG significantly boosts performance over existing methods.

03 Gating mechanisms effectively filter beneficial policies and tasks.

Abstract

Multi-task reinforcement learning endeavors to efficiently leverage shared information across various tasks, facilitating the simultaneous learning of multiple tasks. Existing approaches primarily focus on parameter sharing with carefully designed network structures or tailored optimization procedures. However, they overlook a direct and complementary way to exploit cross-task similarities: the control policies of tasks already proficient in some skills can provide explicit guidance for unmastered tasks to accelerate skills acquisition. To this end, we present a novel framework called Cross-Task Policy Guidance (CTPG), which trains a guide policy for each task to select the behavior policy interacting with the environment from all tasks' control policies, generating better training trajectories. In addition, we propose two gating mechanisms to improve the learning efficiency of CTPG:…
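The selection step described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the one-dimensional state, the three stand-in control policies, the reward functions, and the epsilon-greedy running-mean learner inside `GuidePolicy` are all assumptions chosen to keep the example self-contained, and the two gating mechanisms are omitted because the abstract text above is truncated before describing them. The only idea taken from the source is that a per-task guide policy chooses which task's control policy acts as the behavior policy when collecting a trajectory.

```python
import random
from typing import Callable, List, Tuple

State = float
Action = float
ControlPolicy = Callable[[State], Action]


def make_control_policies() -> List[ControlPolicy]:
    """Hypothetical stand-ins for the control policies of three tasks."""
    return [
        lambda s: s + 1.0,  # task 0's control policy
        lambda s: s - 1.0,  # task 1's control policy
        lambda s: 2.0 * s,  # task 2's control policy
    ]


class GuidePolicy:
    """Toy guide policy for a single task.

    It scores every task's control policy with a running mean of the
    reward observed when that policy acted, and selects epsilon-greedily.
    This learner is an assumption for illustration only.
    """

    def __init__(self, num_tasks: int, epsilon: float = 0.1, seed: int = 0):
        self.scores = [0.0] * num_tasks
        self.counts = [0] * num_tasks
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self) -> int:
        # Explore with probability epsilon, otherwise pick the best score
        # (ties resolve to the lowest task index).
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.scores))
        return max(range(len(self.scores)), key=lambda j: self.scores[j])

    def update(self, choice: int, reward: float) -> None:
        # Incremental running-mean update for the chosen policy's score.
        self.counts[choice] += 1
        self.scores[choice] += (reward - self.scores[choice]) / self.counts[choice]


def collect_trajectory(
    guide: GuidePolicy,
    policies: List[ControlPolicy],
    reward_fn: Callable[[State, Action], float],
    state: State = 0.0,
    steps: int = 5,
) -> List[Tuple[State, int, Action, float]]:
    """Roll out one task, letting the guide pick the behavior policy each step."""
    trajectory = []
    for _ in range(steps):
        j = guide.select()            # which task's control policy acts
        action = policies[j](state)   # that policy interacts with the environment
        reward = reward_fn(state, action)
        guide.update(j, reward)
        trajectory.append((state, j, action, reward))
        state = action                # toy dynamics: next state equals the action
    return trajectory


if __name__ == "__main__":
    policies = make_control_policies()
    guide = GuidePolicy(num_tasks=len(policies), epsilon=0.0)
    # Hypothetical task whose reward favors decreasing the state.
    traj = collect_trajectory(guide, policies, lambda s, a: 1.0 if a < s else -1.0)
    print([step[1] for step in traj])
```

In this toy setting the guide first tries task 0's policy, receives a negative reward, and then switches to task 1's policy, whose behavior happens to suit the current task. That mirrors, in a very reduced form, the idea of borrowing a policy that is already proficient at the needed skill.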

Videos

Efficient Multi-task Reinforcement Learning with Cross-Task Policy Guidance · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning