Action Robust Reinforcement Learning and Applications in Continuous Control
Chen Tessler, Yonathan Efroni, Shie Mannor

TL;DR
This paper introduces new robustness criteria for reinforcement learning policies against action uncertainties, develops algorithms for these criteria, and demonstrates their effectiveness and regularization benefits in continuous control tasks.
Contribution
It formalizes two novel robustness criteria for action uncertainty, proposes algorithms for these criteria, and extends them to deep RL with successful experiments in MuJoCo environments.
Findings
Robust policies withstand adversarial action perturbations.
Action robustness improves performance even without perturbations.
Proposed methods act as implicit regularizers in RL.
Abstract
A policy is said to be robust if it maximizes the reward while considering a bad, or even adversarial, model. In this work we formalize two new criteria of robustness to action uncertainty. Specifically, we consider two scenarios in which the agent attempts to perform an action a, and (i) with probability α, an alternative adversarial action ā is taken, or (ii) an adversary adds a perturbation to the selected action in the case of continuous action space. We show that our criteria are related to common forms of uncertainty in robotics domains, such as the occurrence of abrupt forces, and suggest algorithms in the tabular case. Building on the suggested algorithms, we generalize our approach to deep reinforcement learning (DRL) and provide extensive experiments in the various MuJoCo domains. Our experiments show that not only does our approach produce robust policies, but…
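The two scenarios in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `probabilistic_step` and `noisy_step` are hypothetical, the adversary is a fixed action rather than a learned policy, and scenario (ii) is modeled here, as an assumption, by the convex mix (1 - α)·a + α·ā.

```python
import random

def probabilistic_step(agent_action, adversary_action, alpha, rng):
    """Scenario (i): with probability alpha the adversary's action is
    executed in place of the agent's action."""
    if rng.random() < alpha:
        return list(adversary_action)
    return list(agent_action)

def noisy_step(agent_action, adversary_action, alpha):
    """Scenario (ii), continuous actions. Assumption: the perturbation is
    modeled as the convex mix (1 - alpha) * a + alpha * a_bar."""
    return [(1 - alpha) * a + alpha * b
            for a, b in zip(agent_action, adversary_action)]

rng = random.Random(0)           # seeded for reproducibility
agent = [1.0, 0.0]               # action the agent intends to take
adversary = [-1.0, 0.0]          # hypothetical fixed adversarial action
alpha = 0.1

# Count how often the adversary takes over across many steps.
swaps = sum(
    probabilistic_step(agent, adversary, alpha, rng) == adversary
    for _ in range(10_000)
)
# Executed action when the adversary perturbs every step.
mixed = noisy_step(agent, adversary, alpha)
```

In both cases α controls the adversary's strength: at α = 0 the agent's action is executed unchanged, and at α = 1 the adversary has full control.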
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
