Recurrent Action Transformer with Memory
Egor Cherepanov, Alexey Staroverov, Alexey K. Kovalev, Aleksandr I. Panov

TL;DR
This paper introduces RATE, a transformer-based offline RL architecture with recurrent memory, enhancing decision-making in memory-dependent, partially observable environments while maintaining competitiveness on standard benchmarks.
Contribution
We propose RATE, a novel transformer architecture with integrated recurrent memory, improving long-term information retention in offline RL tasks.
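To make the idea of recurrent memory concrete, here is a minimal sketch of segment-level recurrence with memory tokens. It is not the authors' implementation. It assumes one common way of adding recurrent memory to a transformer: the trajectory is split into fixed-length segments, a few memory vectors are attended to alongside each segment, and a scalar gate blends old and new memory. The names `process_segment`, `gated_update`, `run_recurrent`, and the `gate` parameter are hypothetical, and a single untrained attention step stands in for a full transformer block.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def process_segment(memory, segment):
    """Attend over [memory tokens + segment tokens] (assumed layout).

    Returns per-token outputs for the segment and a candidate for the
    next memory, read out at the memory positions.
    """
    context = memory + segment
    outputs = [attend(tok, context, context) for tok in segment]
    candidate = [attend(mem, context, context) for mem in memory]
    return outputs, candidate


def gated_update(old, new, gate):
    """Hypothetical retention gate: gate=0 keeps old memory, gate=1 overwrites it."""
    return [
        [gate * n + (1.0 - gate) * o for o, n in zip(old_vec, new_vec)]
        for old_vec, new_vec in zip(old, new)
    ]


def run_recurrent(trajectory, segment_len, num_memory, gate=0.5):
    """Process a long trajectory segment by segment, carrying memory forward."""
    dim = len(trajectory[0])
    memory = [[0.0] * dim for _ in range(num_memory)]  # assumed zero init
    outputs = []
    for start in range(0, len(trajectory), segment_len):
        segment = trajectory[start:start + segment_len]
        seg_out, candidate = process_segment(memory, segment)
        memory = gated_update(memory, candidate, gate)
        outputs.extend(seg_out)
    return outputs, memory
```

In this sketch each attention call sees only `num_memory + segment_len` tokens, so the per-segment cost does not grow with trajectory length; information from earlier segments reaches later ones only through the memory vectors.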
Findings
RATE outperforms baselines in memory-intensive environments
RATE maintains competitive performance on standard benchmarks
Memory mechanisms are crucial for effective offline RL in POMDPs
Abstract
Transformers have become increasingly popular in offline reinforcement learning (RL) due to their ability to treat agent trajectories as sequences, reframing policy learning as a sequence modeling task. However, in partially observable environments (POMDPs), effective decision-making depends on retaining information about past events -- something that standard transformers struggle with due to the quadratic complexity of self-attention, which limits their context length. One solution to this problem is to extend transformers with memory mechanisms. We propose the Recurrent Action Transformer with Memory (RATE), a novel transformer-based architecture for offline RL that incorporates a recurrent memory mechanism designed to regulate information retention. We evaluate RATE across a diverse set of environments: memory-intensive tasks (ViZDoom-Two-Colors, T-Maze, Memory Maze,…
Taxonomy
Topics: Neural Networks and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Residual Connection · Softmax
