Decision Transformer: Reinforcement Learning via Sequence Modeling

Lili Chen; Kevin Lu; Aravind Rajeswaran; Kimin Lee; Aditya Grover; Michael Laskin; Pieter Abbeel; Aravind Srinivas; Igor Mordatch

arXiv:2106.01345·cs.LG·June 25, 2021·465 cites

TL;DR

This paper presents Decision Transformer, a novel approach that models reinforcement learning as a sequence prediction task using Transformers, enabling effective offline RL without traditional value functions or policy gradients.

Contribution

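It introduces a Transformer-based architecture for RL that conditions on the desired return, past states, and past actions. The model predicts actions directly, without fitting value functions or computing policy gradients, and matches or exceeds state-of-the-art model-free offline RL baselines.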
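The conditioning described above can be illustrated with a small sketch. It assumes the desired return is represented per timestep as an undiscounted return-to-go (the sum of rewards from that step to the end of the episode) and that each timestep contributes three tokens in the order return-to-go, state, action. The function names `returns_to_go` and `interleave` and the toy trajectory are hypothetical and chosen only for illustration; this is not the authors' code.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of rewards from each timestep to the end of the episode."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each entry accumulates all future rewards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def interleave(
    rtg: Sequence[float], states: Sequence[Any], actions: Sequence[Any]
) -> List[Tuple[str, int, Any]]:
    """Lay out one trajectory as (return-to-go, state, action) triples per timestep."""
    tokens: List[Tuple[str, int, Any]] = []
    for t, (g, s, a) in enumerate(zip(rtg, states, actions)):
        tokens.append(("rtg", t, g))
        tokens.append(("state", t, s))
        tokens.append(("action", t, a))
    return tokens


# Toy trajectory (assumed data, for illustration only).
rewards = [0.0, 1.0, 0.0, 2.0]
states = ["s0", "s1", "s2", "s3"]
actions = ["a0", "a1", "a2", "a3"]

rtg = returns_to_go(rewards)
tokens = interleave(rtg, states, actions)
print(rtg)         # [3.0, 3.0, 2.0, 2.0]
print(tokens[:3])  # first timestep: return-to-go, state, action
```

In a real model each token would be embedded and fed to the Transformer, which is trained to predict the action token at each timestep from the tokens that precede it.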

Findings

01. Matches or exceeds state-of-the-art model-free offline RL baselines

02. Effective on Atari, OpenAI Gym, and Key-to-Door tasks

03. Simplifies RL by framing it as sequence modeling

Abstract

We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on…
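As a rough illustration of how a causally masked, return-conditioned model might be used to generate actions, the sketch below builds a lower-triangular attention mask and runs a rollout in which the target return is reduced by each reward received. Everything here is an assumption made for illustration: `toy_policy` stands in for a trained Transformer, `toy_env_step` is a made-up five-step environment, and the decrement rule reflects a common reading of return-conditioned evaluation rather than a verified copy of the authors' implementation.

```python
from typing import Callable, List, Tuple


def causal_mask(n: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def rollout(
    policy: Callable[[List[float], List[int], List[int]], int],
    env_step: Callable[[int, int], Tuple[int, float, bool]],
    initial_state: int,
    target_return: float,
    max_steps: int,
) -> Tuple[float, List[float]]:
    """Generate actions autoregressively, conditioning on the remaining target return."""
    rtgs, states, actions = [target_return], [initial_state], []
    total = 0.0
    for _ in range(max_steps):
        action = policy(rtgs, states, actions)
        next_state, reward, done = env_step(states[-1], action)
        actions.append(action)
        total += reward
        if done:
            break
        # The remaining target shrinks by the reward just obtained.
        rtgs.append(rtgs[-1] - reward)
        states.append(next_state)
    return total, rtgs


def toy_policy(rtgs: List[float], states: List[int], actions: List[int]) -> int:
    """Hypothetical stand-in for a trained model: act only while return is still owed."""
    return 1 if rtgs[-1] > 0 else 0


def toy_env_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Hypothetical environment: action 1 pays reward 1; the episode ends at state 5."""
    next_state = state + 1
    return next_state, float(action), next_state >= 5


total, rtgs = rollout(toy_policy, toy_env_step, 0, 3.0, 5)
print(total)  # 3.0
print(rtgs)   # [3.0, 2.0, 1.0, 0.0, 0.0]
```

The mask is shown separately because a real model would apply it inside attention, so that the prediction at each position depends only on earlier tokens.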

Code & Models

Videos

Decision Transformer: Reinforcement Learning via Sequence Modeling (Research Paper Explained)· youtube

Daniel Cahn - Slingshot AI (AI Therapy)· youtube

Decision Transformer: Reinforcement Learning via Sequence Modeling· slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Topic Modeling · Adversarial Robustness in Machine Learning

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Residual Connection · Dense Connections