RL + Transformer = A General-Purpose Problem Solver
Micah Rentschler, Jesse Roberts

TL;DR
This paper introduces a transformer-based meta-learner fine-tuned with reinforcement learning that can solve new problems, adapt to changing environments, and improve its solutions, demonstrating emergent in-context learning abilities.
Contribution
It presents In-Context Reinforcement Learning (ICRL), an emergent ability in which a pre-trained transformer, fine-tuned with reinforcement learning over multiple episodes, learns to solve problems it has never encountered before.
Findings
Achieves strong performance on unseen in-distribution environments
Is robust to the quality of its training data and performs strongly in out-of-distribution environments
Adapts to non-stationary environments and iteratively improves its solutions
Abstract
What if artificial intelligence could not only solve problems for which it was trained but also learn to teach itself to solve new problems (i.e., meta-learn)? In this study, we demonstrate that a pre-trained transformer fine-tuned with reinforcement learning over multiple episodes develops the ability to solve problems that it has never encountered before - an emergent ability called In-Context Reinforcement Learning (ICRL). This powerful meta-learner not only excels in solving unseen in-distribution environments with remarkable sample efficiency, but also shows strong performance in out-of-distribution environments. In addition, we show that it exhibits robustness to the quality of its training data, seamlessly stitches together behaviors from its context, and adapts to non-stationary environments. These behaviors demonstrate that an RL-trained transformer can iteratively improve upon…
Taxonomy
Topics: Digital Filter Design and Implementation · Advanced Algorithms and Applications
