Synthetic POMDPs to Challenge Memory-Augmented RL: Memory Demand Structure Modeling
Yongyi Wang, Lingfeng Li, Bozhou Chen, Ang Li, Hanyu Liu, Qirui Zheng, Xionghui Yang, Wenxin Li

TL;DR
This paper introduces a theoretical framework and methodology for designing customizable, scalable POMDP environments with tunable memory challenges to improve evaluation of memory-augmented RL agents.
Contribution
The paper presents a theoretical framework based on Memory Demand Structure (MDS) and a methodology for constructing POMDPs with predefined memory challenges.
Findings
Develops a theoretical framework for analyzing POMDPs based on Memory Demand Structure (MDS).
Describes a methodology for constructing POMDPs with specific memory demands.
Provides a suite of scalable POMDP environments with adjustable difficulty.
Abstract
Recent benchmarks for memory-augmented reinforcement learning (RL) have introduced partially observable Markov decision process (POMDP) environments in which agents must use historical observations to make decisions. However, these benchmarks often lack fine-grained control over the challenges posed to memory models. Synthetic environments offer a solution, enabling precise manipulation of environment dynamics for rigorous and interpretable evaluation of memory-augmented RL. This paper advances the design of such customizable POMDPs with three key contributions: (1) a theoretical framework for analyzing POMDPs based on Memory Demand Structure (MDS) and related concepts; (2) a methodology using linear dynamics, state aggregation, and reward redistribution to construct POMDPs with predefined MDS; and (3) a suite of lightweight, scalable POMDP environments with tunable difficulty, grounded…
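To make the idea of a tunable memory challenge concrete, the following is a minimal toy sketch and not the paper's construction. It assumes a hidden state that evolves by a linear shift, an observation that exposes only the newest entry of that state, and a reward that depends on the observation from `delay` steps earlier. The names `DelayedRecallEnv`, `run_episode`, `delay`, `horizon`, and `window` are hypothetical and chosen only for illustration.

```python
import random


class DelayedRecallEnv:
    """Toy POMDP (illustrative only): reward at step t depends on the
    observation emitted `delay` steps earlier."""

    def __init__(self, delay=3, horizon=20, seed=0):
        self.delay = delay      # hypothetical difficulty knob: how far back to recall
        self.horizon = horizon  # episode length
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        # Hidden state: shift register of the last delay + 1 bits, newest first.
        self.state = [0] * (self.delay + 1)
        self.state[0] = self.rng.randint(0, 1)
        # The agent sees only the newest bit, not the full register.
        return self.state[0]

    def step(self, action):
        # The target is the bit observed `delay` steps ago; no reward during warm-up.
        target = self.state[self.delay]
        reward = 1.0 if (self.t >= self.delay and action == target) else 0.0
        # Linear shift dynamics: push a fresh bit, drop the oldest one.
        self.state = [self.rng.randint(0, 1)] + self.state[:-1]
        self.t += 1
        return self.state[0], reward, self.t >= self.horizon


def run_episode(env, window):
    """Roll out an agent that remembers only its last `window` observations
    (window must be at least 1)."""
    obs = env.reset()
    history = [obs]
    total, done = 0.0, False
    while not done:
        recent = history[-window:]
        # Recall the bit from `delay` steps ago if still in memory, else guess 0.
        action = recent[-(env.delay + 1)] if len(recent) > env.delay else 0
        obs, reward, done = env.step(action)
        history.append(obs)
        total += reward
    return total


if __name__ == "__main__":
    for window in (1, 4):
        env = DelayedRecallEnv(delay=3, horizon=20, seed=0)
        print(window, run_episode(env, window))
```

In this sketch an agent whose memory window covers `delay + 1` observations collects the maximum return of `horizon - delay`, while an agent with a shorter window can only guess. Raising `delay` is one simple way to scale the memory demand without changing anything else about the environment.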
