Online Robust Policy Learning in the Presence of Unknown Adversaries
Aaron J. Havens, Zhanhong Jiang, Soumik Sarkar

TL;DR
This paper proposes MLAH, an online, attack model-agnostic framework for robust reinforcement learning that mitigates adversarial bias by learning separate policies guided by a supervisory agent, enhancing safety in cyber-physical systems.
Contribution
Introduces MLAH, a novel meta-learned hierarchy framework that handles attacks in decision space and mitigates bias in reinforcement learning under adversarial conditions.
Findings
Enables policy learning with lower bias under attack.
Effective in cyber-physical system scenarios.
Outperforms state-of-the-art approaches in simulations.
Abstract
The growing prospect of deep reinforcement learning (DRL) being used in cyber-physical systems has raised concerns around safety and robustness of autonomous agents. Recent work on generating adversarial attacks has shown that it is computationally feasible for a bad actor to fool a DRL policy into behaving suboptimally. Although certain adversarial attacks with specific attack models have been addressed, most studies are only interested in off-line optimization in the data space (e.g., example fitting, distillation). This paper introduces a Meta-Learned Advantage Hierarchy (MLAH) framework that is attack model-agnostic and better suited to reinforcement learning, because it handles attacks in the decision space (as opposed to the data space) and directly mitigates the learned bias introduced by the adversary. In MLAH, we learn separate sub-policies (nominal and adversarial) in an online manner,…
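The structural idea in the abstract (two sub-policies, with a supervisor deciding which one acts and learns) can be illustrated with a small toy. The following is only a sketch under stated assumptions and is not the paper's algorithm: the stateless two-action problem, the reward tables, the deterministic exploration schedule, and the switching rule (pick the sub-policy whose value estimate better predicts the observed reward) are all hypothetical stand-ins, and a hand-written rule takes the place of the learned supervisory agent.

```python
# Toy illustration only: all names, rewards, and the switching rule are assumptions.
NOMINAL_REWARD = [1.0, 0.0]   # hypothetical rewards per action without an attack
ATTACK_REWARD = [-1.0, 0.0]   # hypothetical rewards per action under attack


class SubPolicy:
    """Tabular value estimates for a stateless two-action problem."""

    def __init__(self, n_actions=2, lr=0.2):
        self.values = [0.0] * n_actions
        self.lr = lr

    def greedy(self):
        # Ties resolve to the lowest action index.
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action, reward):
        self.values[action] += self.lr * (reward - self.values[action])


class Supervisor:
    """Hand-written stand-in for a supervisory agent (not a learned one)."""

    def __init__(self, nominal, adversarial):
        self.policies = {"nominal": nominal, "adversarial": adversarial}
        self.active = "nominal"

    def route(self, action, reward):
        # Switch only if the inactive sub-policy predicts this reward strictly better.
        other = "adversarial" if self.active == "nominal" else "nominal"
        err_active = abs(reward - self.policies[self.active].values[action])
        err_other = abs(reward - self.policies[other].values[action])
        if err_other < err_active:
            self.active = other
        return self.active


def run(schedule, explore_every=5):
    """schedule: list of booleans, True where the toy attack is active."""
    nominal, adversarial = SubPolicy(), SubPolicy()
    supervisor = Supervisor(nominal, adversarial)
    log = []
    for t, attacked in enumerate(schedule):
        action = supervisor.policies[supervisor.active].greedy()
        if t % explore_every == explore_every - 1:
            action = 1 - action  # deterministic exploration of the other action
        reward = (ATTACK_REWARD if attacked else NOMINAL_REWARD)[action]
        chosen = supervisor.route(action, reward)
        # Only the selected sub-policy learns from this transition, so
        # experience gathered under attack does not bias the nominal one.
        supervisor.policies[chosen].update(action, reward)
        log.append(chosen)
    return nominal, adversarial, log


nominal, adversarial, log = run([False] * 20 + [True] * 20)
print(nominal.values, adversarial.values)
```

In this toy run, the nominal sub-policy keeps preferring action 0 after the attack begins, because the attacked transitions are routed to the adversarial sub-policy instead. A single policy trained on the same stream would have its estimate for action 0 pulled down by the attacked rewards.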
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Electrostatic Discharge in Electronics
