Robust Reinforcement Learning under model misspecification

Lebin Yu; Jian Wang; Xudong Zhang

arXiv:2103.15370 · cs.LG · March 30, 2021

TL;DR

This paper introduces a framework for robust reinforcement learning that uses trajectory history and POMDP modeling to handle differences in transition dynamics between the training and deployment environments, along with an adversarial attack method that assists robust training.

Contribution

It presents a novel approach combining trajectory history, POMDP modeling, and adversarial attacks to enhance robustness in reinforcement learning under model misspecification.

Findings

01 Framework effectively handles environment transition differences.

02 Experimental validation in four gym domains shows improved robustness.

03 Adversarial attack method aids in training robustness.

Abstract

Reinforcement learning has achieved remarkable performance in a wide range of tasks. Nevertheless, some unsolved problems limit its application to real-world control. One of them is model misspecification, a situation in which an agent is trained and deployed in environments with different transition dynamics. We propose a novel framework that uses trajectory history and Partially Observable Markov Decision Process (POMDP) modeling to address this problem. Additionally, we put forward an efficient adversarial attack method to assist robust training. Our experiments in four Gym domains validate the effectiveness of our framework.
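
One way to read the history-based POMDP idea in the abstract: when the transition dynamics at deployment are unknown, they behave like hidden state, and a window of recent observations and actions carries evidence about them. The sketch below is only an illustration under that assumption, not the authors' implementation. `HistoryBuffer`, its parameters, and the zero-padding choice are hypothetical and were introduced here for the example.

```python
from collections import deque
from typing import List, Sequence


class HistoryBuffer:
    """Keeps the last `horizon` (observation, action) pairs and flattens them.

    Hypothetical helper for illustration; not taken from the paper or its repository.
    """

    def __init__(self, obs_dim: int, act_dim: int, horizon: int) -> None:
        self.obs_dim = obs_dim
        self.act_dim = act_dim
        self.horizon = horizon
        # deque with maxlen silently drops the oldest step once the window is full
        self._steps: deque = deque(maxlen=horizon)

    def reset(self) -> None:
        """Clear the window at the start of each episode."""
        self._steps.clear()

    def push(self, obs: Sequence[float], action: Sequence[float]) -> None:
        """Record one (observation, action) step."""
        if len(obs) != self.obs_dim or len(action) != self.act_dim:
            raise ValueError("observation or action has the wrong dimension")
        self._steps.append(list(obs) + list(action))

    def features(self) -> List[float]:
        """Return a fixed-length vector, zero-padded at the front while history is short."""
        step_dim = self.obs_dim + self.act_dim
        missing = self.horizon - len(self._steps)
        flat = [0.0] * (missing * step_dim)
        for step in self._steps:
            flat.extend(step)
        return flat
```

In a training loop, the policy input would be the current observation concatenated with `features()`, so the policy can condition on how the environment has recently responded to its actions rather than on a single observation.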

Code & Models

Repositories

PaladinEE15/RSAC · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Smart Grid Security and Resilience