Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
Marco Pleines, Matthias Pallasch, Frank Zimmer, Mike Preuss

TL;DR
Memory Gym introduces a suite of environments to benchmark agent memory in dynamic, endless tasks, and evaluates Transformer-XL and GRU models, revealing that GRU excels in prolonged scenarios.
Contribution
The paper develops new endless memory tasks, implements Transformer-XL with PPO as a baseline, and compares its performance to GRU in both the finite and the endless environments.
Findings
Transformer-XL outperforms GRU in finite tasks with auxiliary loss.
GRU significantly outperforms Transformer-XL in endless tasks.
The environments effectively challenge and measure memory capabilities of agents.
Abstract
Memory Gym presents a suite of 2D partially observable environments, namely Mortar Mayhem, Mystery Path, and Searing Spotlights, designed to benchmark memory capabilities in decision-making agents. These environments, originally with finite tasks, are expanded into innovative, endless formats, mirroring the escalating challenges of cumulative memory games such as "I packed my bag". This progression in task design shifts the focus from merely assessing sample efficiency to also probing the levels of memory effectiveness in dynamic, prolonged scenarios. To address the gap in available memory-based Deep Reinforcement Learning baselines, we introduce an implementation within the open-source CleanRL library that integrates Transformer-XL (TrXL) with Proximal Policy Optimization. This approach utilizes TrXL as a form of episodic memory, employing a sliding window technique. Our comparative…
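The sliding window over episodic memory mentioned in the abstract can be illustrated with a minimal sketch. It assumes the memory stores one entry per time step of the current episode and that the agent attends only to the most recent `window_length` entries. The class `EpisodicMemory`, its methods, and the helper `sliding_window_indices` are hypothetical names chosen for illustration; this is not the CleanRL implementation described in the paper.

```python
# Minimal sketch of a sliding window over an episodic memory.
# All names here are hypothetical and chosen for illustration only.

def sliding_window_indices(t, window_length):
    """Indices of the past steps visible at time step t (at most window_length)."""
    start = max(0, t - window_length)
    return list(range(start, t))


class EpisodicMemory:
    """Stores one item per time step and exposes only the most recent ones."""

    def __init__(self, window_length):
        if window_length < 1:
            raise ValueError("window_length must be at least 1")
        self.window_length = window_length
        self.items = []

    def reset(self):
        # Called at the start of every episode: the memory is episodic.
        self.items.clear()

    def append(self, item):
        # Store the item (e.g. a hidden state) produced at the current step.
        self.items.append(item)

    def window(self):
        # Only the last window_length items are visible to the agent.
        return self.items[-self.window_length:]


memory = EpisodicMemory(window_length=3)
for step in range(5):
    memory.append(step)
visible = memory.window()
```

In an endless task the episode can outlast any fixed window, so entries older than `window_length` steps fall out of view under this scheme, whereas a recurrent model such as a GRU carries a compressed state forward indefinitely.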
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
Methods: Multi-Head Attention · Attention Is All You Need · Focus · Softmax · Cosine Annealing · Linear Layer · Variational Dropout · Adam · Residual Connection
