Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation
Ruida Zhou, Tao Liu, Min Cheng, Dileep Kalathil, P. R. Kumar, Chao Tian

TL;DR
This paper introduces a scalable robust reinforcement learning method that combines natural actor-critic with two novel uncertainty sets. The learned policies are resilient to model mismatch, and the method is demonstrated in simulation and on real-world tasks.
Contribution
It proposes two new uncertainty set formulations and a robust natural actor-critic algorithm with convergence guarantees for large-scale problems.
Findings
The RNAC algorithm converges to the optimal robust policy within function approximation error.
The proposed method outperforms baseline approaches in MuJoCo environments.
The learned robust policies improve real-world TurtleBot navigation performance.
Abstract
We study robust reinforcement learning (RL) with the goal of determining a well-performing policy that is robust against model mismatch between the training simulator and the testing environment. Previous policy-based robust RL algorithms mainly focus on the tabular setting under uncertainty sets that facilitate robust policy evaluation, but are no longer tractable when the number of states scales up. To this end, we propose two novel uncertainty set formulations, one based on double sampling and the other on an integral probability metric. Both make large-scale robust RL tractable even when one only has access to a simulator. We propose a robust natural actor-critic (RNAC) approach that incorporates the new uncertainty sets and employs function approximation. We provide finite-time convergence guarantees for the proposed RNAC algorithm to the optimal robust policy within the function…
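To make the idea of robustness against model mismatch concrete, the following is a minimal sketch of robust policy evaluation in a small tabular setting. It is not the paper's RNAC algorithm and does not use either of its uncertainty sets. It assumes a hypothetical finite uncertainty set of candidate transition matrices (already combined with a fixed policy) and takes the worst case independently for each state. The function name `robust_policy_evaluation` and all numbers are illustrative assumptions.

```python
import numpy as np


def robust_policy_evaluation(rewards, kernels, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate a worst-case Bellman operator for a fixed policy.

    rewards: shape (S,), expected one-step reward in each state.
    kernels: shape (K, S, S), K candidate state-transition matrices
             under the fixed policy (a hypothetical finite uncertainty set).
    """
    rewards = np.asarray(rewards, dtype=float)
    kernels = np.asarray(kernels, dtype=float)
    v = np.zeros_like(rewards)
    for _ in range(max_iter):
        # Expected next-state value under each candidate model: shape (K, S).
        next_values = kernels @ v
        # Worst case over the candidate models, taken separately per state.
        v_new = rewards + gamma * next_values.min(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


# Illustrative two-state example: a nominal model and one perturbed model.
rewards = np.array([1.0, 0.0])
nominal = np.array([[0.9, 0.1], [0.1, 0.9]])
perturbed = np.array([[0.7, 0.3], [0.1, 0.9]])
kernels = np.stack([nominal, perturbed])

v_nominal = robust_policy_evaluation(rewards, kernels[:1])
v_robust = robust_policy_evaluation(rewards, kernels)
```

Because the minimum is taken over a set that contains the nominal model, `v_robust` is never larger than `v_nominal` in any state. Enumerating models like this does not scale to large state spaces, which is the tractability problem the abstract says the new uncertainty sets and function approximation are meant to address.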
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Adversarial Robustness in Machine Learning
