AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning

Renye Yan; Yaozhong Gan; You Wu; Junliang Xing; Ling Liangn; Yeshang Zhu; Yimao Cai

arXiv:2410.04498·cs.LG·October 8, 2024

Open Access

TL;DR

AdaMemento introduces an adaptive memory framework for reinforcement learning that leverages both positive and negative experiences, guided by intrinsic motivation and ensemble learning, to improve exploration and policy optimization in sparse reward environments.

Contribution

It proposes a novel memory-reflection module and intrinsic motivation paradigm, enhancing experience filtering and exploration in RL beyond simple memory storage.

Findings

01 Achieves significant performance improvements over previous methods.

02 Effectively distinguishes subtle states for better exploration.

03 Theoretically proves the advantages of the intrinsic motivation and ensemble mechanisms.

Abstract

In sparse reward scenarios of reinforcement learning (RL), the memory mechanism provides promising shortcuts to policy optimization by reflecting on past experiences like humans. However, current memory-based RL methods simply store and reuse high-value policies, lacking a deeper refining and filtering of diverse past experiences and hence limiting the capability of memory. In this paper, we propose AdaMemento, an adaptive memory-enhanced RL framework. Instead of just memorizing positive past experiences, we design a memory-reflection module that exploits both positive and negative experiences by learning to predict known local optimal policies based on real-time states. To effectively gather informative trajectories for the memory, we further introduce a fine-grained intrinsic motivation paradigm, where nuances in similar states can be precisely distinguished to guide exploration. The…

Taxonomy

Topics: Reinforcement Learning in Robotics