Working Memory Graphs
Ricky Loynd, Roland Fernandez, Asli Celikyilmaz, Adith Swaminathan, and Matthew Hausknecht

TL;DR
The paper introduces the Working Memory Graph (WMG), a Transformer-based agent for sequential decision-making. WMG applies multi-head self-attention to a dynamic set of vectors representing observed and recurrent state, which improves learning efficiency in environments with factored observation spaces.
Contribution
The paper presents the Working Memory Graph, a Transformer-based architecture for RL agents that exploits factored observations to improve sample efficiency.
Findings
WMG outperforms baselines in Pathfinding, BabyAI, and Sokoban environments.
Transformer architecture with factored observations boosts learning efficiency.
The reported gains in learning efficiency relative to baselines are significant across the three evaluated environments.
Abstract
Transformers have increasingly outperformed gated RNNs in obtaining new state-of-the-art results on supervised tasks involving text sequences. Inspired by this trend, we study the question of how Transformer-based models can improve the performance of sequential decision-making agents. We present the Working Memory Graph (WMG), an agent that employs multi-head self-attention to reason over a dynamic set of vectors representing observed and recurrent state. We evaluate WMG in three environments featuring factored observation spaces: a Pathfinding environment that requires complex reasoning over past observations, BabyAI gridworld levels that involve variable goals, and Sokoban which emphasizes future planning. We find that the combination of WMG's Transformer-based architecture with factored observation spaces leads to significant gains in learning efficiency compared to baseline…
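The core idea in the abstract, multi-head self-attention over a set of vectors that mixes observed and recurrent state, can be sketched as follows. This is a minimal NumPy illustration and not the authors' implementation: the names `factors` and `memos`, the dimensions, and the random weights are all assumptions made for the example, and the real agent's embedding, layer structure, and training procedure are not shown.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over a set of n vectors.

    x has shape (n, d); n may differ from call to call, which is what
    allows the input set to be dynamic.
    """
    n, d = x.shape
    head_dim = d // num_heads

    def split(t):
        # (n, d) -> (num_heads, n, head_dim)
        return t.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, n, n)
    weights = softmax(scores, axis=-1)                     # each row sums to 1
    heads = weights @ v                                    # (heads, n, head_dim)
    merged = heads.transpose(1, 0, 2).reshape(n, d)        # concatenate heads
    return merged @ w_o, weights


rng = np.random.default_rng(0)
d_model, num_heads = 8, 2

# Hypothetical inputs: three embedded observation factors and two
# recurrent state vectors carried over from earlier time steps.
factors = rng.normal(size=(3, d_model))
memos = rng.normal(size=(2, d_model))
tokens = np.concatenate([factors, memos], axis=0)

# Randomly initialized projection matrices (illustrative only).
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))

out, weights = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads)
```

Because attention treats its input as a set, the same function works unchanged if the number of observation factors or recurrent vectors differs between time steps; only the first dimension of `tokens` changes.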
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Data Quality and Management
