Multi-task Reinforcement Learning in Reproducing Kernel Hilbert Spaces via Cross-learning

Juan Cervino; Juan Andres Bazerque; Miguel Calvo-Fullana; Alejandro Ribeiro

arXiv:2008.11895·eess.SY·November 24, 2021

TL;DR

This paper introduces a multi-task reinforcement learning approach using cross-learning in reproducing kernel Hilbert spaces, enabling quick adaptation to related tasks and environments, with theoretical convergence guarantees and practical navigation experiments.

Contribution

It proposes a novel cross-learning framework for multi-task RL in RKHS, providing convergence guarantees and a method for rapid policy adaptation across related tasks.

Findings

01 The approach produces a central policy useful for quick adaptation.

02 The method converges to a near-optimal solution with high probability.

03 Agents successfully navigate new environments with unseen obstacles.

Abstract

Reinforcement learning (RL) is a framework to optimize a control policy using rewards that are revealed by the system as a response to a control action. In its standard form, RL involves a single agent that uses its policy to accomplish a specific task. These methods require large amounts of reward samples to achieve good performance, and may not generalize well when the task is modified, even if the new task is related. In this paper we are interested in a collaborative scheme in which multiple agents with different tasks optimize their policies jointly. To this end, we introduce cross-learning, in which agents tackling related tasks have their policies constrained to be close to one another. Two properties make our new approach attractive: (i) it produces a multi-task central policy that can be used as a starting point to adapt quickly to one of the tasks trained for, in a situation…
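The proximity constraint described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: finite-dimensional parameter vectors stand in for RKHS policies, a quadratic toy reward stands in for the RL objective, and the central policy is assumed to be the plain average of the task parameters. The names `project_to_ball`, `cross_learn`, `targets`, `eps`, and `step` are hypothetical.

```python
import numpy as np


def project_to_ball(theta, center, eps):
    """Project theta onto the Euclidean ball of radius eps around center."""
    diff = theta - center
    norm = np.linalg.norm(diff)
    if norm <= eps:
        return theta
    return center + eps * diff / norm


def cross_learn(targets, eps=0.5, step=0.1, iters=200):
    """Toy cross-learning loop (illustrative assumption, not the paper's method).

    Task i maximizes the toy reward -||theta_i - targets[i]||^2 while its
    parameters are kept within eps of a shared central parameter vector.
    """
    targets = np.asarray(targets, dtype=float)
    thetas = np.zeros_like(targets)
    center = thetas.mean(axis=0)
    for _ in range(iters):
        # Gradient ascent step on each task's own toy reward.
        thetas = thetas + step * 2.0 * (targets - thetas)
        # Central parameters: assumed here to be the average over tasks.
        center = thetas.mean(axis=0)
        # Enforce the proximity constraint ||theta_i - center|| <= eps.
        thetas = np.array([project_to_ball(t, center, eps) for t in thetas])
    return thetas, center


if __name__ == "__main__":
    thetas, center = cross_learn([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]], eps=0.5)
    print("central parameters:", center)
    print("distances to center:", np.linalg.norm(thetas - center, axis=1))
```

In this sketch a small `eps` pulls all task parameters toward a common point, while a large `eps` lets each task fit its own objective independently, which is the trade-off the closeness constraint is meant to control.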
