Perception-Prediction-Reaction Agents for Deep Reinforcement Learning
Adam Stooke, Valentin Dalibard, Siddhant M. Jayakumar, Wojciech M. Czarnecki, and Max Jaderberg

TL;DR
This paper presents a hierarchical recurrent agent architecture, trained with auxiliary losses, that improves reinforcement learning in partially observable environments requiring long-term memory and outperforms an LSTM baseline.
Contribution
The paper introduces the Perception-Prediction-Reaction (PPR) agent, which combines a temporal hierarchy of recurrent cores with auxiliary losses to improve long-term memory in reinforcement learning agents.
Findings
PPR outperforms an LSTM baseline on DMLab-30 tasks.
PPR shows significant improvements in the Capture the Flag environment.
Ablation studies confirm that all components are necessary.
Abstract
We introduce a new recurrent agent architecture and associated auxiliary losses which improve reinforcement learning in partially observable tasks requiring long-term memory. We employ a temporal hierarchy, using a slow-ticking recurrent core to allow information to flow more easily over long time spans, and three fast-ticking recurrent cores with connections designed to create an information asymmetry. The reaction core incorporates new observations with input from the slow core to produce the agent's policy; the perception core accesses only short-term observations and informs the slow core; lastly, the prediction core accesses only long-term memory. An auxiliary loss regularizes policies drawn from all three cores against each other, enacting the prior that the policy should be expressible from either recent or long-term memory. We present the resulting…
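The auxiliary loss described in the abstract can be illustrated with a minimal sketch. The excerpt above does not give the exact form of the loss, so this example assumes a sum of pairwise KL divergences between the action distributions of the three fast cores. The function names (`softmax`, `kl_divergence`, `policy_agreement_loss`) and the choice of penalizing both KL directions are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert a list of action logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
    )


def policy_agreement_loss(reaction_logits, perception_logits, prediction_logits):
    """Assumed regularizer: sum of KL divergences over all ordered pairs
    of the three policies. It is zero when the three cores agree and
    grows as their action distributions diverge."""
    policies = [
        softmax(logits)
        for logits in (reaction_logits, perception_logits, prediction_logits)
    ]
    loss = 0.0
    for i, p in enumerate(policies):
        for j, q in enumerate(policies):
            if i != j:
                loss += kl_divergence(p, q)
    return loss


if __name__ == "__main__":
    # Hypothetical logits over three actions, one list per fast core.
    agree = policy_agreement_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
    differ = policy_agreement_loss([2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0])
    print("agreeing cores:", agree)
    print("disagreeing cores:", differ)
```

In a training setup, a term of this kind would be weighted and added to the usual reinforcement learning objective, so that the policy computed from recent observations and the policy computed from long-term memory are pulled toward each other. The weighting and the direction of the divergence are design choices this sketch leaves open.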
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
