Unsupervised Predictive Memory in a Goal-Directed Agent
Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain

TL;DR
This paper introduces MERLIN, a memory-augmented reinforcement learning model that effectively handles partial observability in complex 3D environments by using predictive modeling to guide memory formation.
Contribution
The paper presents MERLIN, a novel AI architecture that combines memory, reinforcement learning, and inference, enabling agents to solve complex, partially observable tasks without simplifying assumptions.
Findings
MERLIN outperforms traditional RL algorithms in partially observable 3D tasks.
Memory guided by predictive modeling improves long-term task performance.
The model operates effectively without assumptions about the dimensionality of its sensory input.
Abstract
Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement learning (RL) algorithms with deep neural networks, and the excitement surrounding these results has led to the pursuit of related ideas as explanations of non-human animal learning. However, we demonstrate that contemporary RL algorithms struggle to solve simple tasks when enough information is concealed from the sensors of the agent, a property called "partial observability". An obvious requirement for handling partially observed tasks is access to extensive memory, but we show memory is not…
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Explainable Artificial Intelligence (XAI)
