RL + Transformer = A General-Purpose Problem Solver

Micah Rentschler; Jesse Roberts

arXiv:2501.14176 · cs.LG · January 27, 2025

TL;DR

This paper introduces a transformer-based meta-learner fine-tuned with reinforcement learning that can solve new problems, adapt to changing environments, and improve its solutions, demonstrating emergent in-context learning abilities.

Contribution

The paper shows that In-Context Reinforcement Learning (ICRL), the ability to solve previously unseen problems from context, emerges when a pre-trained transformer is fine-tuned with reinforcement learning over multiple episodes.

Findings

01. Achieves strong performance on unseen in-distribution environments, with high sample efficiency

02. Is robust to the quality of its training data and performs strongly in out-of-distribution environments

03. Adapts to non-stationary environments and iteratively improves its solutions

Abstract

What if artificial intelligence could not only solve problems for which it was trained but also learn to teach itself to solve new problems (i.e., meta-learn)? In this study, we demonstrate that a pre-trained transformer fine-tuned with reinforcement learning over multiple episodes develops the ability to solve problems that it has never encountered before - an emergent ability called In-Context Reinforcement Learning (ICRL). This powerful meta-learner not only excels in solving unseen in-distribution environments with remarkable sample efficiency, but also shows strong performance in out-of-distribution environments. In addition, we show that it exhibits robustness to the quality of its training data, seamlessly stitches together behaviors from its context, and adapts to non-stationary environments. These behaviors demonstrate that an RL-trained transformer can iteratively improve upon…
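The idea of learning in context across multiple episodes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `TwoArmedBandit`, `context_policy`, and `run_in_context` are hypothetical names, the environment is a toy, and the policy is a simple hand-written rule standing in for the transformer. It is not the authors' model or training procedure. The only point it shows is that behavior can change from episode to episode because the context grows, while no parameters are updated.

```python
import random
from collections import defaultdict


class TwoArmedBandit:
    """Hypothetical toy environment: each episode is a single pull of one arm."""

    def __init__(self, probs=(0.2, 0.8), seed=0):
        self.probs = probs
        self.rng = random.Random(seed)

    def step(self, action):
        # Reward is 1.0 with the chosen arm's payout probability, else 0.0.
        return 1.0 if self.rng.random() < self.probs[action] else 0.0


def context_policy(context, n_actions, rng, epsilon=0.1):
    """Stand-in for the transformer: chooses an action from the context alone.

    `context` is a list of (action, reward) pairs from earlier episodes.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for action, reward in context:
        totals[action] += reward
        counts[action] += 1

    # Try every action once before exploiting.
    untried = [a for a in range(n_actions) if counts[a] == 0]
    if untried:
        return untried[0]

    # Occasionally explore; otherwise pick the best average reward so far.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: totals[a] / counts[a])


def run_in_context(env, n_episodes=200, seed=1):
    """Roll out several episodes; only the context changes between them."""
    rng = random.Random(seed)
    context = []  # grows across episodes; no weights are updated anywhere
    rewards = []
    for _ in range(n_episodes):
        action = context_policy(context, 2, rng)
        reward = env.step(action)
        context.append((action, reward))
        rewards.append(reward)
    return context, rewards


if __name__ == "__main__":
    context, rewards = run_in_context(TwoArmedBandit())
    print("mean reward, first 20 episodes:", sum(rewards[:20]) / 20)
    print("mean reward, last 20 episodes:", sum(rewards[-20:]) / 20)
```

In the paper's setting the role of `context_policy` would be played by the RL-fine-tuned transformer, with earlier episodes supplied as part of its input. The rule above only mimics that interface.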

Taxonomy

Topics: Digital Filter Design and Implementation · Advanced Algorithms and Applications