Learning What to Memorize: Using Intrinsic Motivation to Form Useful Memory in Partially Observable Reinforcement Learning

Alper Demir

arXiv:2110.12810·cs.LG·February 22, 2023

TL;DR

This paper introduces an intrinsic motivation-driven memory mechanism in reinforcement learning, enabling agents to selectively memorize rare observations to better handle partial observability and long-term dependencies.

Contribution

It proposes a novel approach where agents learn to control their memory through intrinsic motivation, improving performance in partially observable environments.

Findings

01. Enhanced disambiguation of environment states

02. Improved performance on long-term dependency tasks

03. Memory control driven by intrinsic motivation

Abstract

Reinforcement learning faces an important challenge in partially observable environments that have long-term dependencies. To learn in an ambiguous environment, an agent has to keep previous perceptions in a memory. Earlier memory-based approaches use a fixed method to determine what to keep in memory, which limits them to certain problems. In this study, we follow the idea of giving control of the memory to the agent by allowing it to have memory-changing actions. This learning mechanism is supported by an intrinsic motivation to memorize rare observations that can help the agent disambiguate its state in the environment. Our approach is evaluated and analyzed on several partially observable tasks with long-term dependencies and compared with other memory-based methods.
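
To make the idea in the abstract concrete, here is a minimal sketch of an agent-controlled memory with a rarity-based intrinsic reward. The abstract does not give the reward formula or the memory layout, so everything here is an assumption made for illustration: the class name `MemoryAugmentedState`, the bounded first-in-first-out memory, and the bonus `1 - frequency` are hypothetical and may differ from the paper's actual design.

```python
from collections import Counter, deque


class MemoryAugmentedState:
    """Hypothetical wrapper: agent state = (current observation, memory contents)."""

    def __init__(self, capacity=1):
        self.memory = deque(maxlen=capacity)  # oldest entry is dropped when full
        self.counts = Counter()               # how often each observation was seen
        self.total = 0

    def observe(self, obs):
        """Record an observation and return the augmented state seen by the agent."""
        self.counts[obs] += 1
        self.total += 1
        return (obs, tuple(self.memory))

    def rarity_bonus(self, obs):
        """Assumed intrinsic reward: 1 - empirical frequency (higher means rarer)."""
        if self.total == 0:
            return 0.0
        return 1.0 - self.counts[obs] / self.total

    def apply_memory_action(self, obs, memorize):
        """Memory-changing action: store obs only if the agent chooses to.

        Returns the intrinsic reward for the choice (zero when nothing is stored).
        """
        if not memorize:
            return 0.0
        self.memory.append(obs)
        return self.rarity_bonus(obs)


# Usage: a rare "signal" observation earns a larger bonus than a common one,
# and once stored it disambiguates later, otherwise identical observations.
m = MemoryAugmentedState(capacity=1)
for obs in ["corridor", "corridor", "corridor", "signal"]:
    m.observe(obs)
common = m.apply_memory_action("corridor", memorize=True)
rare = m.apply_memory_action("signal", memorize=True)
state = m.observe("corridor")
```

In a full agent, the memory action would be selected by the learned policy alongside the environment action, and the intrinsic bonus would be added to the environment reward during learning. Both of those parts are omitted from this sketch.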
