Robust Reinforcement Learning under model misspecification
Lebin Yu, Jian Wang, Xudong Zhang

TL;DR
This paper introduces a new framework for robust reinforcement learning that uses trajectory history and POMDP modeling to handle environment differences, along with an adversarial attack method to improve training robustness.
Contribution
It presents a novel approach combining trajectory history, POMDP modeling, and adversarial attacks to enhance robustness in reinforcement learning under model misspecification.
Findings
The framework handles differences in transition dynamics between training and deployment environments.
Experimental validation in four gym domains shows improved robustness.
Adversarial attack method aids in training robustness.
Abstract
Reinforcement learning has achieved remarkable performance in a wide range of tasks. Nevertheless, some unsolved problems limit its application to real-world control. One of them is model misspecification, a situation where an agent is trained and deployed in environments with different transition dynamics. We propose a novel framework that utilizes trajectory history and Partially Observable Markov Decision Process (POMDP) modeling to deal with this dilemma. Additionally, we put forward an efficient adversarial attack method to assist robust training. Our experiments in four gym domains validate the effectiveness of our framework.
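To make the POMDP view of model misspecification concrete, here is a minimal sketch. It is not the authors' framework. It assumes a hypothetical 1-D point-mass environment whose `mass` is a hidden dynamics parameter, plus a hypothetical `HistoryWrapper` that appends the last `k` (observation, action) pairs to each observation. The names `PointMass`, `HistoryWrapper`, and `estimate_mass` are illustrative and do not come from the paper. The point is only that the hidden parameter cannot be read from a single observation but can be recovered from a short trajectory history.

```python
from collections import deque


class PointMass:
    """Toy 1-D environment; `mass` is a hidden dynamics parameter (hypothetical)."""

    def __init__(self, mass, dt=0.1):
        self.mass = mass
        self.dt = dt
        self.pos = 0.0
        self.vel = 0.0

    def reset(self):
        self.pos, self.vel = 0.0, 0.0
        return (self.pos, self.vel)

    def step(self, force):
        # The transition depends on the hidden mass, which the agent never observes.
        self.vel += (force / self.mass) * self.dt
        self.pos += self.vel * self.dt
        return (self.pos, self.vel)


class HistoryWrapper:
    """Augments each observation with the last `k` (observation, action) pairs."""

    def __init__(self, env, k=3):
        self.env = env
        self.k = k
        self.history = deque(maxlen=k)
        self.last_obs = None

    def reset(self):
        self.history.clear()
        self.last_obs = self.env.reset()
        return self._features(self.last_obs)

    def step(self, action):
        # Record the observation the action was taken from, then advance.
        self.history.append((self.last_obs, action))
        self.last_obs = self.env.step(action)
        return self._features(self.last_obs)

    def _features(self, obs):
        # Zero-pad missing history so the feature vector has a fixed length.
        pad = [((0.0, 0.0), 0.0)] * (self.k - len(self.history))
        flat = []
        for (p, v), a in pad + list(self.history):
            flat.extend([p, v, a])
        flat.extend(obs)
        return flat


def estimate_mass(features, dt=0.1):
    """Recover the hidden mass from the most recent transition in the features.

    Layout of the tail: [..., prev_pos, prev_vel, action, cur_pos, cur_vel].
    """
    prev_vel, action, cur_vel = features[-4], features[-3], features[-1]
    dv = cur_vel - prev_vel
    if dv == 0:
        return None  # no force applied yet, so the mass is not identifiable
    return action * dt / dv
```

With a training environment `HistoryWrapper(PointMass(mass=1.0))` and a deployment environment `HistoryWrapper(PointMass(mass=2.0))`, the features returned by `reset()` are identical, but after one `step(1.0)` they differ and `estimate_mass` separates the two dynamics. A policy conditioned on such history features can, in principle, adapt to the dynamics it is actually deployed in.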
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Smart Grid Security and Resilience
