Learning to Modulate pre-trained Models in RL

Thomas Schmied; Markus Hofmarcher; Fabian Paischer; Razvan Pascanu; Sepp Hochreiter

arXiv:2306.14884·cs.LG·October 30, 2023·1 cites

Open Access · 1 Repo · 2 Datasets · 1 Video

TL;DR

This paper investigates catastrophic forgetting in RL when pre-trained models are fine-tuned on new tasks. It proposes a modulation method, Learning-to-Modulate (L2M), that retains pre-trained skills while adapting to new tasks and achieves state-of-the-art results on Continual-World.

Contribution

The paper introduces Learning-to-Modulate (L2M), a new method that prevents forgetting in RL fine-tuning by modulating a frozen pre-trained model's information flow.
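One way to picture "modulating a frozen model" is a layer whose pre-trained weight is never updated, with a small trainable term added to its output. The sketch below is only an illustration under stated assumptions: the low-rank additive form, the class name `ModulatedLinear`, and the `rank` parameter are hypothetical and are not taken from this page or confirmed as the authors' implementation. It uses the Python standard library only, although the official repository is listed as PyTorch.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ModulatedLinear:
    """Frozen weight W plus a trainable low-rank term B @ A.

    Assumption: the additive low-rank form is an illustrative choice,
    not the confirmed L2M mechanism.
    """

    def __init__(self, frozen_weight, rank, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(frozen_weight), len(frozen_weight[0])
        # Pre-trained weight: read during the forward pass, never modified.
        self.frozen_weight = frozen_weight
        # Trainable modulators. B starts at zero, so the layer initially
        # reproduces the pre-trained output exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def forward(self, x, modulate=True):
        base = matvec(self.frozen_weight, x)
        if not modulate:
            # Pre-training behaviour: skip the modulators entirely.
            return base
        delta = matvec(self.B, matvec(self.A, x))
        return [b + d for b, d in zip(base, delta)]


W = [[1.0, 2.0], [0.5, -1.0]]
layer = ModulatedLinear(W, rank=1)
x = [1.0, 1.0]

pretrained_out = layer.forward(x, modulate=False)
# Pretend fine-tuning on a new task changed only the modulators.
layer.A = [[1.0, 0.0]]
layer.B = [[0.3], [-0.2]]
new_task_out = layer.forward(x)
still_pretrained = layer.forward(x, modulate=False)
```

Because only `A` and `B` change during fine-tuning in this sketch, switching the modulation off recovers the pre-trained output, which is the property the contribution statement describes as preventing forgetting.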

Findings

01 · L2M outperforms existing fine-tuning methods on Continual-World.

02 · Most fine-tuning approaches cause significant performance degradation on pre-training tasks.

03 · L2M retains pre-trained skills while achieving high performance on new tasks.
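The degradation on pre-training tasks mentioned in the findings can be quantified, for instance, as the per-task drop in score between the pre-trained model and the fine-tuned model. The following is a minimal sketch of that bookkeeping. The function names, the task names, and every number are placeholders chosen for illustration; they are not results from the paper, and the paper's own evaluation protocol may differ.

```python
def forgetting(before, after):
    """Per-task score drop on pre-training tasks after fine-tuning.

    `before` and `after` map task name to a score where higher is better.
    A positive value means the skill degraded.
    """
    return {task: before[task] - after[task] for task in before}


def mean_forgetting(before, after):
    """Average drop across all pre-training tasks."""
    drops = forgetting(before, after)
    return sum(drops.values()) / len(drops)


# Placeholder scores, NOT taken from the paper.
scores_before = {"task_a": 0.9, "task_b": 0.8}
scores_after_full_finetune = {"task_a": 0.4, "task_b": 0.8}
scores_after_frozen_backbone = {"task_a": 0.9, "task_b": 0.8}

drop_full = mean_forgetting(scores_before, scores_after_full_finetune)
drop_frozen = mean_forgetting(scores_before, scores_after_frozen_backbone)
```

A method that leaves the pre-trained weights untouched would show a mean drop of zero under this measure, while a method that overwrites them can show a large positive drop.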

Abstract

Reinforcement Learning (RL) has been successful in various domains like robotics, game playing, and simulation. While RL agents have shown impressive capabilities in their specific tasks, they insufficiently adapt to new tasks. In supervised learning, this adaptation problem is addressed by large-scale pre-training followed by fine-tuning to new down-stream tasks. Recently, pre-training on multiple tasks has been gaining traction in RL. However, fine-tuning a pre-trained model often suffers from catastrophic forgetting. That is, the performance on the pre-training tasks deteriorates when fine-tuning on new tasks. To investigate the catastrophic forgetting phenomenon, we first jointly pre-train a model on datasets from two benchmark suites, namely Meta-World and DMControl. Then, we evaluate and compare a variety of fine-tuning methods prevalent in natural language processing, both in…

Code & Models

Repositories

ml-jku/l2m
PyTorch · Official

Datasets

Videos

Learning to Modulate pre-trained Models in RL · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Machine Learning and Data Classification