On Transforming Reinforcement Learning by Transformer: The Development Trajectory

Shengchao Hu; Li Shen; Ya Zhang; Yixin Chen; Dacheng Tao

arXiv:2212.14164·cs.LG·January 24, 2023·1 cites

TL;DR

This paper surveys recent progress in applying transformer models to reinforcement learning, highlighting architecture improvements and trajectory optimization, and discusses future research directions in this rapidly evolving field.

Contribution

It provides a comprehensive review of transformer-based reinforcement learning, categorizing recent advances and analyzing their applications, limitations, and future prospects.

Findings

01

Transformers are applied to RL along two lines: architecture enhancement and trajectory optimization.

02

The main applications of transformer-based RL are robotic manipulation, text-based games, navigation, and autonomous driving.

03

Challenges include inherent RL issues like bootstrapping and the 'deadly triad'.

Abstract

The transformer, originally devised for natural language processing, has also achieved significant success in computer vision. Thanks to its strong expressive power, researchers are investigating ways to apply transformers to reinforcement learning (RL), and transformer-based models have shown their potential on representative RL benchmarks. In this paper, we collect and dissect recent advances in transforming RL by transformer (transformer-based RL, or TRL) in order to explore its development trajectory and future trends. We group existing developments into two categories, architecture enhancement and trajectory optimization, and examine the main applications of TRL in robotic manipulation, text-based games, navigation, and autonomous driving. For architecture enhancement, these methods consider how to apply the powerful transformer structure to RL problems under the traditional RL…

Taxonomy

Topics: Reinforcement Learning in Robotics