Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch

TL;DR
This paper presents Decision Transformer, a novel approach that models reinforcement learning as a sequence prediction task using Transformers, enabling effective offline RL without traditional value functions or policy gradients.
Contribution
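It introduces a Transformer-based architecture for RL that conditions on the desired return, past states, and past actions to predict the next action. This replaces value-function fitting and policy gradients with conditional sequence modeling, and the abstract reports performance that matches or exceeds state-of-the-art model-free offline RL baselines.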
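To make the conditioning concrete, here is a minimal sketch of how an offline trajectory might be turned into a return-conditioned token sequence. It assumes the "desired return" at each step is the return-to-go (the sum of rewards from that step to the end of the episode) and that tokens are interleaved as (return, state, action). The function names `returns_to_go` and `interleave` are illustrative and do not come from this page or from the authors' code.

```python
from typing import List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Return-to-go at step t: sum of rewards from t to the end of the episode."""
    rtg: List[float] = []
    running = 0.0
    # Accumulate from the last step backwards, then restore time order.
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave(
    rtg: Sequence[float], states: Sequence[object], actions: Sequence[object]
) -> List[Tuple[str, object]]:
    """Build the sequence (return_1, state_1, action_1, return_2, ...)."""
    if not (len(rtg) == len(states) == len(actions)):
        raise ValueError("rtg, states, and actions must have the same length")
    tokens: List[Tuple[str, object]] = []
    for g, s, a in zip(rtg, states, actions):
        tokens.extend([("return", g), ("state", s), ("action", a)])
    return tokens


# Toy three-step episode (made-up values, for illustration only).
rewards = [1.0, 0.0, 2.0]
states = ["s1", "s2", "s3"]
actions = ["a1", "a2", "a3"]

rtg = returns_to_go(rewards)
tokens = interleave(rtg, states, actions)
```

In this sketch each timestep contributes three tokens, so a context of K timesteps corresponds to 3K positions in the sequence model's input.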
Findings
Matches or exceeds state-of-the-art offline RL performance
Effective on Atari, OpenAI Gym, and Key-to-Door tasks
Simplifies RL by framing it as sequence modeling
Abstract
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on…
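The two mechanisms the abstract names, causal masking and conditioning on a desired return, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_policy` and `toy_env_step` are hypothetical stand-ins for a trained model and an environment, and the rule of subtracting each observed reward from the remaining target return is an assumption about how the conditioning is updated during a rollout.

```python
from typing import Callable, List, Tuple


def causal_mask(n: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def rollout(
    policy: Callable[[List[float], List[int], List[int]], int],
    env_step: Callable[[int, int], Tuple[int, float, bool]],
    state: int,
    target_return: float,
    max_steps: int,
) -> Tuple[float, List[float]]:
    """Act autoregressively, conditioning on the return still to be achieved."""
    returns, states, actions = [target_return], [state], []
    total = 0.0
    for _ in range(max_steps):
        action = policy(returns, states, actions)
        state, reward, done = env_step(state, action)
        total += reward
        actions.append(action)
        if done:
            break
        # Assumed update rule: remaining target shrinks by the reward received.
        returns.append(returns[-1] - reward)
        states.append(state)
    return total, returns


def toy_policy(returns: List[float], states: List[int], actions: List[int]) -> int:
    """Hypothetical policy: collect reward only while some target return remains."""
    return 1 if returns[-1] > 0 else 0


def toy_env_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Hypothetical environment: action 1 pays reward 1; episode ends at state 5."""
    next_state = state + 1
    return next_state, float(action), next_state >= 5


total, remaining = rollout(toy_policy, toy_env_step, 0, 3.0, 5)
```

With a target return of 3.0, the toy policy collects exactly three units of reward and then stops, which is the behavior the conditioning is meant to produce: the requested return steers which actions are generated.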
Code & Models
- 🤗 edbeeching/decision-transformer-gym-halfcheetah-expert · model · 70 dl · ♡ 1
- 🤗 edbeeching/decision-transformer-gym-halfcheetah-medium · model · 91 dl · ♡ 1
- 🤗 edbeeching/decision-transformer-gym-halfcheetah-medium-replay · model · 44 dl
- 🤗 edbeeching/decision-transformer-gym-hopper-expert · model · 429 dl · ♡ 19
- 🤗 edbeeching/decision-transformer-gym-hopper-medium · model · 1.8k dl · ♡ 7
- 🤗 edbeeching/decision-transformer-gym-hopper-medium-replay · model · 40 dl
- 🤗 edbeeching/decision-transformer-gym-walker2d-expert · model · 55 dl · ♡ 4
- 🤗 edbeeching/decision-transformer-gym-walker2d-medium · model · 55 dl · ♡ 1
- 🤗 edbeeching/decision-transformer-gym-walker2d-medium-replay · model · 23 dl · ♡ 1
- 🤗 RamAnanth1/decision_transformers_half_cheetah · model · 3 dl
Taxonomy
Topics: Reinforcement Learning in Robotics · Topic Modeling · Adversarial Robustness in Machine Learning
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Residual Connection · Dense Connections
