PIGDreamer: Privileged Information Guided World Models for Safe Partially Observable Reinforcement Learning
Dongchi Huang, Jiaqi Wang, Yang Li, Chunhe Xia, Tianle Zhang, Kaige Zhang

TL;DR
PIGDreamer introduces a novel model-based reinforcement learning approach that uses privileged information to improve safety and performance in partially observable environments, backed by theoretical analysis and empirical validation.
Contribution
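The paper proposes ACPOMDPs, a framework for theoretically examining the benefit of privileged information in Safe RL, and PIGDreamer, a model-based method that uses privileged information during training and is reported to outperform existing Safe RL methods.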
Findings
PIGDreamer significantly outperforms existing Safe RL methods.
The approach demonstrates improved robustness and efficiency.
Empirical results validate the benefits of privileged information in safe RL.
Abstract
Partial observability presents a significant challenge for Safe Reinforcement Learning (Safe RL), as it impedes the identification of potential risks and rewards. Leveraging specific types of privileged information during training to mitigate the effects of partial observability has yielded notable empirical successes. In this paper, we propose Asymmetric Constrained Partially Observable Markov Decision Processes (ACPOMDPs) to theoretically examine the advantages of incorporating privileged information in Safe RL. Building upon ACPOMDPs, we propose the Privileged Information Guided Dreamer (PIGDreamer), a model-based RL approach that leverages privileged information to enhance the agent's safety and performance through privileged representation alignment and an asymmetric actor-critic structure. Our empirical results demonstrate that PIGDreamer significantly outperforms existing Safe RL…
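The asymmetric actor-critic idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes linear functions in place of neural networks, omits the world model and the representation alignment entirely, and every name in it (`Step`, `AsymmetricActorCritic`, `critic_update`) is hypothetical. It only shows the asymmetry itself: the actor reads the partial observation alone, while the reward and cost critics also read privileged features that exist only during training.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    """One training transition (hypothetical container, not from the paper)."""
    observation: List[float]  # what the deployed agent can see
    privileged: List[float]   # extra information available only during training
    reward: float
    cost: float               # safety cost signal used in Safe RL


def linear(weights: List[float], features: List[float]) -> float:
    """Dot product standing in for a neural network forward pass."""
    return sum(w * x for w, x in zip(weights, features))


class AsymmetricActorCritic:
    def __init__(self, obs_dim: int, priv_dim: int) -> None:
        # Actor: sized for the observation only, so it can run at deployment.
        self.actor_w = [0.0] * obs_dim
        # Critics: sized for observation plus privileged features.
        self.reward_critic_w = [0.0] * (obs_dim + priv_dim)
        self.cost_critic_w = [0.0] * (obs_dim + priv_dim)

    def act(self, observation: List[float]) -> float:
        """Choose an action from the partial observation alone."""
        return linear(self.actor_w, observation)

    def values(self, observation: List[float],
               privileged: List[float]) -> Tuple[float, float]:
        """Return (reward value, cost value) using privileged features."""
        feats = observation + privileged
        return (linear(self.reward_critic_w, feats),
                linear(self.cost_critic_w, feats))

    def critic_update(self, step: Step, lr: float = 0.1) -> None:
        """Simplified regression of both critics toward the immediate
        reward and cost (no bootstrapping, for illustration only)."""
        feats = step.observation + step.privileged
        v_r, v_c = self.values(step.observation, step.privileged)
        for i, x in enumerate(feats):
            self.reward_critic_w[i] += lr * (step.reward - v_r) * x
            self.cost_critic_w[i] += lr * (step.cost - v_c) * x
```

In this sketch, `act` never touches the privileged features, so the policy stays usable when that information is unavailable at test time; only the training-time critics depend on it.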
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
