Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays

Lise Aubin; Mehdi Khamassi (ISIR); Benoît Girard (ISIR)

arXiv:1802.05594·cs.AI·August 14, 2018

TL;DR

This paper introduces a neural network version of prioritized sweeping Q-learning that copes with multiple predecessors, and investigates whether it can explain the hippocampal replays observed experimentally during rest. The resulting architecture improves learning in a simulated navigation task.

Contribution

A neural network implementation of prioritized sweeping Q-learning, built on a growing multiple expert algorithm that handles multiple predecessors, and proposed as a reinforcement learning account of hippocampal replays.

Findings

01

Improved learning performance in simulated navigation tasks.

02

The model predicts that, in animals, the replays occurring during rest should be shuffled.

03

The model predicts that, in animals, learning of the world model should occur during rest periods.

Abstract

During sleep and awake rest, the hippocampus replays sequences of place cells that have been activated during prior experiences. These replays have been interpreted as a memory consolidation process, but recent results suggest a possible interpretation in terms of reinforcement learning. The Dyna reinforcement learning algorithms use off-line replays to improve learning. Under a limited replay budget, a prioritized sweeping approach, which requires a model of the transitions to the predecessors, can be used to improve performance. We investigate whether such algorithms can explain the experimentally observed replays. We propose a neural network version of prioritized sweeping Q-learning, for which we developed a growing multiple expert algorithm, able to cope with multiple predecessors. The resulting architecture is able to improve the learning of simulated agents confronted with a navigation task.…
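
To make the prioritized sweeping idea in the abstract concrete, here is a minimal tabular sketch in Python. It is not the neural network architecture proposed in the paper: the function name, the parameters (`budget`, `theta`, `alpha`), and the toy Y-shaped maze are all assumptions chosen for illustration. It only shows how a limited replay budget can be spent on the transitions with the largest prediction error, and how an update propagates backwards to every predecessor of a state when several exist.

```python
import heapq
from collections import defaultdict


def prioritized_sweeping(transitions, rewards, n_actions, goal,
                         budget=20, gamma=0.9, alpha=1.0, theta=1e-6):
    """Tabular prioritized sweeping over a known deterministic model.

    transitions: dict (state, action) -> next state
    rewards:     dict (state, action) -> immediate reward
    Returns the Q-table and the ordered list of replayed (state, action) pairs.
    """
    Q = defaultdict(float)

    # Predecessor model: a state can be reached from several (state, action)
    # pairs, so each state maps to a *set* of predecessors.
    predecessors = defaultdict(set)
    for (s, a), s_next in transitions.items():
        predecessors[s_next].add((s, a))

    def td_error(s, a):
        s_next = transitions[(s, a)]
        # The goal is terminal, so its value is zero.
        best_next = 0.0 if s_next == goal else max(
            Q[(s_next, b)] for b in range(n_actions))
        return rewards[(s, a)] + gamma * best_next - Q[(s, a)]

    # Seed the queue with every pair whose error is already significant.
    # heapq is a min-heap, so priorities are stored negated.
    queue = []
    for (s, a) in transitions:
        delta = td_error(s, a)
        if abs(delta) > theta:
            heapq.heappush(queue, (-abs(delta), (s, a)))

    replayed = []
    while queue and len(replayed) < budget:
        _, (s, a) = heapq.heappop(queue)
        # Recompute the error at pop time: the queued priority may be stale.
        Q[(s, a)] += alpha * td_error(s, a)
        replayed.append((s, a))
        # Propagate backwards to all predecessors of the updated state.
        for (p, b) in predecessors[s]:
            delta = td_error(p, b)
            if abs(delta) > theta:
                heapq.heappush(queue, (-abs(delta), (p, b)))
    return Q, replayed


# Hypothetical Y-shaped maze: states 0 and 1 both lead to state 2,
# which leads to the rewarded goal state 3.
ADVANCE, STAY = 0, 1
transitions = {(0, ADVANCE): 2, (0, STAY): 0,
               (1, ADVANCE): 2, (1, STAY): 1,
               (2, ADVANCE): 3, (2, STAY): 2}
rewards = {pair: 0.0 for pair in transitions}
rewards[(2, ADVANCE)] = 1.0

Q, replayed = prioritized_sweeping(transitions, rewards, n_actions=2, goal=3)
print(replayed)
```

In this toy maze the rewarded pair `(2, ADVANCE)` is replayed first, and the update then propagates to both branches leading into state 2. A single-predecessor model would only update one of those branches, which is the limitation the multiple-predecessor extension addresses.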

Taxonomy

Methods: Prioritized Sweeping