Learning a subspace of policies for online adaptation in Reinforcement Learning

Jean-Baptiste Gaya; Laure Soulier; Ludovic Denoyer

arXiv:2110.05169·cs.LG·October 25, 2022

TL;DR

This paper introduces a method for online adaptation in reinforcement learning by learning a subspace of policies, enabling better generalization to unseen environment variations without extensive tuning or additional components.

Contribution

The authors propose a novel subspace policy learning approach that improves generalization and online adaptation in RL without relying on meta-RL or extra modules.
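
The subspace idea can be illustrated with a small sketch. This is not the authors' implementation (the official code is in the repository listed under Code & Models). It assumes, for illustration only, that the subspace is the line segment between two anchor parameter sets, and that online adaptation means evaluating a handful of candidate points on that segment in the test environment and keeping the best one. The function names (`interpolate`, `toy_return`, `adapt_online`) and the toy return function are hypothetical stand-ins, not part of any library.

```python
import numpy as np


def interpolate(anchors, weights):
    """Return a convex combination of anchor parameter arrays.

    anchors: array of shape (n_anchors, ...) holding one parameter set per anchor.
    weights: non-negative weights of length n_anchors that sum to 1.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    # Contract the weight vector with the leading (anchor) axis.
    return np.tensordot(weights, anchors, axes=1)


def toy_return(theta, target):
    """Hypothetical stand-in for an episode return in the test environment.

    The return is higher (closer to 0) when theta is closer to target.
    A real setup would roll out the policy and sum rewards instead.
    """
    return -float(np.sum((theta - target) ** 2))


def adapt_online(anchors, evaluate, n_candidates=5):
    """Evaluate evenly spaced points between two anchors and keep the best.

    Assumes exactly two anchors (a line segment). No gradient step is taken:
    adaptation is a search over already-trained parameters.
    """
    best_alpha, best_theta, best_return = None, None, -np.inf
    for alpha in np.linspace(0.0, 1.0, n_candidates):
        theta = interpolate(anchors, [1.0 - alpha, alpha])
        ret = evaluate(theta)
        if ret > best_return:
            best_alpha, best_theta, best_return = float(alpha), theta, ret
    return best_alpha, best_theta, best_return


# Two toy anchors: all-zeros and all-ones 2x2 parameter matrices.
anchors = np.stack([np.zeros((2, 2)), np.ones((2, 2))])
# Pretend the unseen test environment favours parameters halfway between them.
target = np.full((2, 2), 0.5)

alpha, theta, ret = adapt_online(anchors, lambda t: toy_return(t, target))
print(alpha, ret)
```

In this toy setting the search selects the midpoint of the segment, because that is where the stand-in return peaks. The point of the sketch is only that adaptation reduces to a few evaluations over a low-dimensional family of parameters rather than fine-tuning.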

Findings

1. Outperforms baseline methods on various benchmarks.
2. Learns policies that achieve high rewards in unseen environments.
3. Simple to tune and does not require extra components.

Abstract

Deep Reinforcement Learning (RL) is mainly studied in a setting where the training and testing environments are similar. In many practical applications, however, these environments may differ. For instance, in control systems, the robot(s) on which a policy is learned might differ from the robot(s) on which the policy will run. Such differences can be caused by internal factors (e.g., calibration issues, system attrition, defective modules) or by external changes (e.g., weather conditions). There is a need to develop RL methods that generalize well to variations of the training conditions. In this article, we consider the simplest yet hard-to-tackle generalization setting, where the test environment is unknown at train time, forcing the agent to adapt to the system's new dynamics. This online adaptation process can be computationally expensive (e.g., fine-tuning) and cannot rely on meta-RL…

Code & Models

Repositories

facebookresearch/salina · jax · Official

Videos

Learning a subspace of policies for online adaptation in Reinforcement Learning · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Software Reliability and Analysis Research · Adversarial Robustness in Machine Learning

Methods: Test