Should We Ever Prefer Decision Transformer for Offline Reinforcement Learning?

Yumi Omori; Zixuan Dong; Keith Ross

arXiv:2507.10174·cs.AI·July 15, 2025

TL;DR

This paper compares Decision Transformer (DT) with Filtered Behavior Cloning (FBC), a simpler MLP-based method, for offline reinforcement learning. On Robomimic manipulation tasks and D4RL locomotion benchmarks, FBC matches or outperforms DT in sparse-reward settings, which calls into question whether DT should be preferred.

Contribution

The study shows that Filtered Behavior Cloning (FBC), a straightforward MLP-based method, matches or outperforms Decision Transformer (DT) on sparse-reward offline RL tasks, challenging the claim by Bhargava et al. (2024) that DT is superior in these settings.
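As a rough illustration of what filtering-based behavior cloning involves, the sketch below ranks trajectories by total reward, keeps the best ones, and flattens them into (observation, action) pairs for ordinary supervised training. The top-fraction rule, the `keep_fraction` parameter, and all function names are assumptions made for this example; this listing does not specify the paper's exact filtering criterion.

```python
import math
from typing import List, Sequence, Tuple

# One step is (observation, action, reward); a trajectory is a list of steps.
Step = Tuple[Sequence[float], Sequence[float], float]
Trajectory = List[Step]


def trajectory_return(traj: Trajectory) -> float:
    """Undiscounted sum of rewards along one trajectory."""
    return sum(reward for _, _, reward in traj)


def filter_top_trajectories(
    dataset: List[Trajectory], keep_fraction: float = 0.1
) -> List[Trajectory]:
    """Keep the highest-return trajectories.

    The top-fraction rule is a hypothetical choice for illustration,
    not necessarily the criterion used in the paper.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, math.ceil(keep_fraction * len(dataset)))
    # sorted() is stable, so ties keep their original dataset order.
    ranked = sorted(dataset, key=trajectory_return, reverse=True)
    return ranked[:n_keep]


def to_bc_pairs(
    dataset: List[Trajectory],
) -> List[Tuple[Sequence[float], Sequence[float]]]:
    """Flatten trajectories into (observation, action) supervised pairs."""
    return [(obs, act) for traj in dataset for obs, act, _ in traj]


# Toy sparse-reward data: reward 1.0 only at the end of successful episodes.
failure = [([0.0], [0.0], 0.0), ([0.1], [0.0], 0.0)]
success = [([0.0], [1.0], 0.0), ([0.5], [1.0], 1.0)]
toy_dataset = [failure, success, success, failure]

filtered = filter_top_trajectories(toy_dataset, keep_fraction=0.5)
bc_pairs = to_bc_pairs(filtered)
```

The resulting `bc_pairs` would then be fed to any standard supervised learner, such as an MLP policy trained to predict the action from the observation.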

Findings

01. FBC achieves competitive or better results than DT in sparse-reward tasks.

02. FBC is simpler, requires less data, and is more computationally efficient.

03. DT is not necessarily preferable for offline RL in the tested environments.

Abstract

In recent years, extensive work has explored the application of the Transformer architecture to reinforcement learning problems. Among these, Decision Transformer (DT) has gained particular attention in the context of offline reinforcement learning due to its ability to frame return-conditioned policy learning as a sequence modeling task. Most recently, Bhargava et al. (2024) provided a systematic comparison of DT with more conventional MLP-based offline RL algorithms, including Behavior Cloning (BC) and Conservative Q-Learning (CQL), and claimed that DT exhibits superior performance in sparse-reward and low-quality data settings. In this paper, through experimentation on robotic manipulation tasks (Robomimic) and locomotion benchmarks (D4RL), we show that MLP-based Filtered Behavior Cloning (FBC) achieves competitive or superior performance compared to DT in sparse-reward…
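To make the phrase "return-conditioned policy learning as a sequence modeling task" concrete, the sketch below computes the return-to-go at each timestep and interleaves it with states and actions into a single token sequence, which is the general idea behind DT-style inputs. The function names and the tuple-based token format are assumptions made for this example, not the authors' implementation.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Return-to-go at step t is the sum of rewards from t to the episode end."""
    rtg: List[float] = []
    running = 0.0
    for reward in reversed(rewards):
        running += reward
        rtg.append(running)
    rtg.reverse()
    return rtg


def build_sequence(
    states: Sequence[Any], actions: Sequence[Any], rewards: Sequence[float]
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, state, action) per timestep.

    The ("rtg" | "state" | "action", value) token format is hypothetical.
    """
    if not (len(states) == len(actions) == len(rewards)):
        raise ValueError("states, actions, and rewards must have equal length")
    tokens: List[Tuple[str, Any]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


# Sparse-reward episode: the only reward arrives at the final step.
sparse_rtg = returns_to_go([0.0, 0.0, 1.0])
dense_rtg = returns_to_go([1.0, 2.0, 3.0])
tokens = build_sequence(["s0", "s1"], ["a0", "a1"], [0.0, 1.0])
```

In a sparse-reward episode the return-to-go is constant until the reward is collected, so the conditioning signal mostly distinguishes successful from unsuccessful trajectories.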

Taxonomy

Topics: Complex Systems and Decision Making