Perception-Prediction-Reaction Agents for Deep Reinforcement Learning

Adam Stooke, Valentin Dalibard, Siddhant M. Jayakumar, Wojciech M. Czarnecki, and Max Jaderberg

arXiv:2006.15223·cs.AI·June 30, 2020·1 cites

TL;DR

This paper presents a hierarchical recurrent agent architecture with auxiliary losses that improves reinforcement learning in partially observable environments requiring long-term memory, outperforming an LSTM baseline.

Contribution

The paper introduces the Perception-Prediction-Reaction (PPR) agent with a hierarchical structure and auxiliary losses, a new approach for improving long-term memory in reinforcement learning agents.

Findings

01 PPR outperforms the LSTM baseline on DMLab-30 tasks.

02 PPR shows significant improvements in the Capture the Flag environment.

03 Ablation studies confirm the necessity of all components.

Abstract

We introduce a new recurrent agent architecture and associated auxiliary losses which improve reinforcement learning in partially observable tasks requiring long-term memory. We employ a temporal hierarchy, using a slow-ticking recurrent core to allow information to flow more easily over long time spans, and three fast-ticking recurrent cores with connections designed to create an information asymmetry. The reaction core incorporates new observations with input from the slow core to produce the agent's policy; the perception core accesses only short-term observations and informs the slow core; lastly, the prediction core accesses only long-term memory. An auxiliary loss regularizes policies drawn from all three cores against each other, enacting the prior that the policy should be expressible from either recent or long-term memory. We present the resulting…
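The auxiliary loss described in the abstract can be illustrated with a minimal sketch. The abstract does not state which divergence is used or how the three policies are paired, so the choices below (KL divergence, summed over all ordered pairs, unweighted) are assumptions made purely for illustration, and the function names are hypothetical rather than taken from the paper.

```python
import math


def softmax(logits):
    """Convert a list of action logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
    )


def policy_agreement_loss(reaction_logits, perception_logits, prediction_logits):
    """Hypothetical regularizer: sum of KL terms over all ordered pairs
    of the three per-core policies (an assumed form, not the paper's)."""
    policies = [
        softmax(logits)
        for logits in (reaction_logits, perception_logits, prediction_logits)
    ]
    loss = 0.0
    for i, p in enumerate(policies):
        for j, q in enumerate(policies):
            if i != j:
                loss += kl_divergence(p, q)
    return loss
```

Under these assumptions the loss is zero when all three cores produce the same action distribution and grows as they disagree, which matches the stated prior that the policy should be expressible from either recent or long-term memory.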

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)