Benchmarking Partial Observability in Reinforcement Learning with a Suite of Memory-Improvable Domains
Ruo Yu Tao, Kaicheng Guo, Cameron Allen, George Konidaris

TL;DR
This paper introduces a comprehensive benchmark suite called POBAX for evaluating reinforcement learning algorithms under various forms of partial observability, emphasizing environments that require memory and are representative of real-world challenges.
Contribution
The paper provides guidelines for benchmarking partial observability, introduces the POBAX library with diverse environments, and demonstrates their memory-improvable nature for robust RL evaluation.
Findings
Environments in POBAX are all memory improvable.
The benchmark covers diverse partial observability scenarios.
Recommended hyperparameters facilitate rapid evaluation.
Abstract
Mitigating partial observability is a necessary but challenging task for general reinforcement learning algorithms. To improve an algorithm's ability to mitigate partial observability, researchers need comprehensive benchmarks to gauge progress. Most algorithms tackling partial observability are only evaluated on benchmarks with simple forms of state aliasing, such as feature masking and Gaussian noise. Such benchmarks do not represent the many forms of partial observability seen in real domains, like visual occlusion or unknown opponent intent. We argue that a partially observable benchmark should have two key properties. The first is coverage in its forms of partial observability, to ensure an algorithm's generalizability. The second is a large gap between the performance of agents with more or less state information, all other factors roughly equal. This gap implies that an…
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
