Rethinking Transformers in Solving POMDPs

Chenhao Lu, Ruizhe Shi, Yuyao Liu, Kaizhe Hu, Simon S. Du, and Huazhe Xu

arXiv:2405.17358 · cs.LG · May 31, 2024 · 1 citation

TL;DR

This paper examines Transformers as sequence models for POMDPs, demonstrates their theoretical limitations, and proposes a recurrent alternative, the Deep Linear Recurrent Unit (LRU), which outperforms Transformers in empirical tests.

Contribution

It reveals the theoretical limitations of Transformers in modeling POMDPs and introduces the Deep Linear Recurrent Unit as a more effective architecture for partially observable RL tasks.

Findings

01

Regular languages, which Transformers struggle to model, are reducible to POMDPs.

02

Deep Linear Recurrent Units outperform Transformers in empirical evaluations.

03

Transformers lack the recurrence needed for effective POMDP learning.
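
The link between regular languages and partial observability can be made concrete with a toy example. The sketch below is an illustration only, not the paper's reduction: it casts the regular language PARITY (bit strings with an odd number of ones) as a small episodic POMDP. The class `ParityPOMDP`, the helper `run_episode`, and the two update rules are hypothetical names introduced here. The agent observes one bit per step, and only its final action is rewarded, when it matches the hidden running parity.

```python
class ParityPOMDP:
    """Toy POMDP: the hidden state is the running parity of the bits emitted so far."""

    def __init__(self, bits):
        self.bits = list(bits)

    def reset(self):
        self.t = 0
        self.parity = 0  # hidden state, never shown to the agent
        return self._emit()

    def _emit(self):
        bit = self.bits[self.t]
        self.parity ^= bit
        self.t += 1
        return bit

    def step(self, action):
        """Return (observation, reward, done); only the last action is judged."""
        if self.t < len(self.bits):
            return self._emit(), 0.0, False
        reward = 1.0 if action == self.parity else 0.0
        return None, reward, True


def run_episode(env, update):
    """Roll out an agent whose action is its memory after update(memory, obs)."""
    obs, memory, done, reward = env.reset(), 0, False, 0.0
    while not done:
        memory = update(memory, obs)
        obs, reward, done = env.step(memory)
    return reward


def recurrent_update(memory, obs):
    # One bit of recurrent state is enough to track parity.
    return memory ^ obs


def memoryless_update(memory, obs):
    # Ignores history and answers with the latest observation.
    return obs
```

On the bit string 1, 0, 0 the recurrent update earns the final reward and the memoryless one does not, because the answer depends on the whole history rather than on the latest observation.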

Abstract

Sequential decision-making algorithms such as reinforcement learning (RL) in real-world scenarios inevitably face environments with partial observability. This paper scrutinizes the effectiveness of a popular architecture, namely Transformers, in Partially Observable Markov Decision Processes (POMDPs) and reveals its theoretical limitations. We establish that regular languages, which Transformers struggle to model, are reducible to POMDPs. This poses a significant challenge for Transformers in learning POMDP-specific inductive biases, due to their lack of inherent recurrence found in other models like RNNs. This paper casts doubt on the prevalent belief in Transformers as sequence models for RL and proposes to introduce a point-wise recurrent structure. The Deep Linear Recurrent Unit (LRU) emerges as a well-suited alternative for Partially Observable RL, with empirical results…
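
As a rough illustration of what a point-wise linear recurrence looks like, the sketch below runs h_t = λ ⊙ h_{t-1} + b · x_t with a diagonal complex transition, the general form that linear recurrent units build on. It is not the authors' implementation (the linked repository is in JAX). The function names, the scalar input, and all parameter values are assumptions chosen for illustration. The second function checks the same state against its unrolled sum, the property that lets a linear recurrence be evaluated without stepping through time.

```python
import cmath


def linear_recurrence(lam, b, xs):
    """Run h_t = lam * h_{t-1} + b * x_t element-wise and return every hidden state."""
    h = [0j] * len(lam)
    states = []
    for x in xs:
        h = [l * hi + bi * x for l, hi, bi in zip(lam, h, b)]
        states.append(h)
    return states


def unrolled_state(lam, b, xs, t):
    """Closed form of the same state: h_t = sum over k <= t of lam^(t-k) * b * x_k."""
    return [
        sum(l ** (t - k) * bi * xs[k] for k in range(t + 1))
        for l, bi in zip(lam, b)
    ]


# Hypothetical parameters: eigenvalues inside the unit disk (magnitude, phase).
lam = [cmath.rect(0.9, 0.3), cmath.rect(0.5, 2.0)]
b = [1.0 + 0j, 0.5 - 0.5j]
xs = [1.0, 0.0, -2.0, 3.0]

states = linear_recurrence(lam, b, xs)
closed = unrolled_state(lam, b, xs, len(xs) - 1)
```

Here `states[-1]` and `closed` agree up to floating-point error. Each hidden unit evolves independently of the others, which is what "point-wise" means in this sketch.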

Code & Models

Repositories

ctp314/tfporl (JAX, official)

Taxonomy

Topics: DNA and Biological Computing

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections