On Transforming Reinforcement Learning by Transformer: The Development Trajectory
Shengchao Hu, Li Shen, Ya Zhang, Yixin Chen, Dacheng Tao

TL;DR
This paper surveys recent progress in applying transformer models to reinforcement learning, highlighting architecture improvements and trajectory optimization, and discusses future research directions in this rapidly evolving field.
Contribution
It provides a comprehensive review of transformer-based reinforcement learning, categorizing recent advances and analyzing their applications, limitations, and future prospects.
Findings
Transformers are applied to RL along two lines: architecture enhancement and trajectory optimization.
The main applications of transformer-based RL are robotic manipulation, text-based games, navigation, and autonomous driving.
Challenges include inherent RL issues like bootstrapping and the 'deadly triad'.
Abstract
The Transformer, originally devised for natural language processing, has also achieved significant success in computer vision. Thanks to its strong expressive power, researchers are investigating ways to apply transformers to reinforcement learning (RL), and transformer-based models have shown their potential on representative RL benchmarks. In this paper, we collect and dissect recent advances in transforming RL with transformers (transformer-based RL, or TRL) in order to explore its development trajectory and future trends. We group existing developments into two categories, architecture enhancement and trajectory optimization, and examine the main applications of TRL in robotic manipulation, text-based games, navigation, and autonomous driving. For architecture enhancement, these methods consider how to apply the powerful transformer structure to RL problems under the traditional RL…
Taxonomy
Topics: Reinforcement Learning in Robotics
