Learning What to Memorize: Using Intrinsic Motivation to Form Useful Memory in Partially Observable Reinforcement Learning
Alper Demir

TL;DR
This paper introduces an intrinsic motivation-driven memory mechanism in reinforcement learning, enabling agents to selectively memorize rare observations to better handle partial observability and long-term dependencies.
Contribution
It proposes a novel approach where agents learn to control their memory through intrinsic motivation, improving performance in partially observable environments.
Findings
Enhanced disambiguation of environment states
Improved performance on long-term dependency tasks
Memory control driven by intrinsic motivation
Abstract
Reinforcement learning faces an important challenge in partially observable environments that have long-term dependencies. To learn in an ambiguous environment, an agent has to keep previous perceptions in memory. Earlier memory-based approaches use a fixed method to determine what to keep in memory, which limits them to certain problems. In this study, we follow the idea of giving control of the memory to the agent by allowing it to take memory-changing actions. This learning mechanism is supported by an intrinsic motivation to memorize rare observations that can help the agent disambiguate its state in the environment. Our approach is evaluated and analyzed on several partially observable tasks with long-term dependencies and compared with other memory-based methods.
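The two ideas in the abstract, memory-changing actions and an intrinsic bonus for memorizing rare observations, can be illustrated with a minimal sketch. This is not the paper's implementation: the count-based rarity measure, the "store"/"skip" action names, the fixed-capacity memory, and all class and function names below are assumptions made for illustration only.

```python
from collections import Counter, deque
from itertools import product


class RarityBonus:
    """Hypothetical count-based bonus: rarer observations score higher."""

    def __init__(self, scale=1.0):
        self.counts = Counter()
        self.total = 0
        self.scale = scale

    def update(self, obs):
        # Record one more sighting of this observation.
        self.counts[obs] += 1
        self.total += 1

    def bonus(self, obs):
        # Bonus shrinks as the observation's empirical frequency grows.
        if self.total == 0:
            return 0.0
        freq = self.counts[obs] / self.total
        return self.scale * (1.0 - freq)


class AgentMemory:
    """Fixed-capacity memory that the agent edits through its own actions."""

    def __init__(self, capacity=1):
        self.slots = deque(maxlen=capacity)

    def apply(self, memory_action, obs):
        # "store" pushes the current observation; "skip" leaves memory as is.
        if memory_action == "store":
            self.slots.append(obs)

    def state(self, obs):
        # The learner sees the current observation plus the memory contents.
        return (obs, tuple(self.slots))


def augmented_actions(env_actions, memory_actions=("skip", "store")):
    # Each agent action pairs an environment action with a memory action.
    return list(product(env_actions, memory_actions))


def intrinsic_reward(memory_action, obs, rarity):
    # Assumed rule: only storing an observation earns the rarity bonus.
    return rarity.bonus(obs) if memory_action == "store" else 0.0
```

In a corridor task where a cue appears once and is needed much later, storing the cue would earn a larger bonus than storing a frequently seen corridor cell, and the stored cue would then make otherwise identical corridor observations distinguishable. The learner itself (for example tabular Q-learning over the augmented state and action spaces) is left out of the sketch.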
