Generalizable Episodic Memory for Deep Reinforcement Learning
Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang

TL;DR
This paper introduces GEM, a novel episodic memory method for deep reinforcement learning that improves sample efficiency and performance in continuous and discrete domains by organizing experience for implicit planning.
Contribution
GEM is a new episodic memory approach that effectively organizes state-action values for better experience aggregation and planning in continuous and discrete RL tasks.
Findings
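GEM significantly outperforms existing trajectory-based methods on various MuJoCo continuous control tasks.
GEM also shows significant improvements on Atari games, a discrete-action domain.
GEM uses a double estimator to reduce the overestimation bias induced by value propagation during planning.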
Abstract
Episodic memory-based methods can rapidly latch onto past successful strategies by a non-parametric memory and improve sample efficiency of traditional reinforcement learning. However, little effort is put into the continuous domain, where a state is never visited twice, and previous episodic methods fail to efficiently aggregate experience across trajectories. To address this problem, we propose Generalizable Episodic Memory (GEM), which effectively organizes the state-action values of episodic memory in a generalizable manner and supports implicit planning on memorized trajectories. GEM utilizes a double estimator to reduce the overestimation bias induced by value propagation in the planning process. Empirical evaluation shows that our method significantly outperforms existing trajectory-based methods on various MuJoCo continuous control tasks. To further show the general…
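The abstract's combination of implicit planning along a memorized trajectory and a double estimator can be illustrated with a small sketch. This is not the authors' implementation. It assumes that planning means choosing, for each step of a stored trajectory, how many recorded rewards to follow before bootstrapping from a learned value estimate, and that the double estimator lets one value estimate choose that rollout length while a second one evaluates it. The names `rollout_values`, `double_estimator_targets`, `q_a`, and `q_b` are hypothetical.

```python
from typing import List, Sequence


def rollout_values(
    rewards: Sequence[float], bootstrap: Sequence[float], gamma: float, t: int
) -> List[float]:
    """Return candidate values for step t of one stored trajectory.

    Entry h follows h recorded rewards from step t, then bootstraps from
    bootstrap[t + h]. `bootstrap` needs len(rewards) + 1 entries; the last
    one is the value after the final step (0.0 for a terminal state).
    """
    values = []
    partial = 0.0   # discounted sum of the rewards consumed so far
    discount = 1.0  # gamma ** h
    for h in range(len(rewards) - t + 1):
        values.append(partial + discount * bootstrap[t + h])
        if t + h < len(rewards):
            partial += discount * rewards[t + h]
            discount *= gamma
    return values


def double_estimator_targets(
    rewards: Sequence[float],
    q_a: Sequence[float],
    q_b: Sequence[float],
    gamma: float = 0.99,
) -> List[float]:
    """Assumed double-estimator update: q_a selects the rollout length,
    q_b evaluates it, so an overestimate in q_a is not copied into the target."""
    targets = []
    for t in range(len(rewards)):
        v_a = rollout_values(rewards, q_a, gamma, t)
        v_b = rollout_values(rewards, q_b, gamma, t)
        best_h = max(range(len(v_a)), key=v_a.__getitem__)
        targets.append(v_b[best_h])
    return targets


if __name__ == "__main__":
    # q_a badly overestimates the first state; q_b does not.
    print(double_estimator_targets([1.0, 1.0], [10.0, 0.0, 0.0], [2.0, 0.0, 0.0], gamma=0.5))
```

In the usage line, `q_a` prefers to bootstrap immediately at the first step because of its inflated estimate, but the resulting target is read from `q_b`, which is the decoupling a double estimator relies on. A symmetric call with the two arguments swapped would give the targets for the other estimator.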
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Artificial Intelligence in Games
