TL;DR
This paper explores query-based black-box adversarial attacks on deep reinforcement learning controllers in cyber-physical systems, demonstrating how such attacks can be formulated as reinforcement learning problems and proposing defenses via adversarial training.
Contribution
It formulates query-based black-box attacks on DRL controllers as a reinforcement learning problem, so the attacker is itself a trainable policy, and proposes adversarial training with transfer learning to improve the robustness of the nominal policy.
Findings
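Adversarial policies that observe only the nominal policy's output generate stronger attacks than those that also observe its input.
Nominal policies whose outputs frequently lie at the boundaries of the action space are more robust to these attacks.
Adversarial training with transfer learning reduces the rate of successful targeted attacks by 50%.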
Abstract
Advances in computing resources have resulted in the increasing complexity of cyber-physical systems (CPS). As the complexity of CPS evolved, the focus has shifted from traditional control methods to deep reinforcement learning-based (DRL) methods for control of these systems. This is due to the difficulty of obtaining accurate models of complex CPS for traditional control. However, to securely deploy DRL in production, it is essential to examine the weaknesses of DRL-based controllers (policies) towards malicious attacks from all angles. In this work, we investigate targeted attacks in the action-space domain, also commonly known as actuation attacks in CPS literature, which perturb the outputs of a controller. We show that a query-based black-box attack model that generates optimal perturbations with respect to an adversarial goal can be formulated as another reinforcement learning…
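To make the formulation in the abstract concrete, here is a minimal sketch of how an action-space attack can be framed as the adversary's own reinforcement learning environment. It is not the paper's setup: the 1-D plant, the proportional `nominal_policy`, the `target` state, the perturbation `budget`, and the reward are all hypothetical choices made for illustration. The adversary queries the black-box controller, observes only its output, adds a bounded perturbation to that output, and is rewarded for driving the system toward its own goal.

```python
from dataclasses import dataclass


def clip(x, lo, hi):
    return max(lo, min(hi, x))


def nominal_policy(state):
    """Hypothetical black-box controller: drives the state toward 0."""
    return clip(-0.5 * state, -1.0, 1.0)


@dataclass
class AttackEnv:
    """Adversary's environment around a toy plant x' = x + 0.1 * u."""
    target: float = 2.0   # adversarial goal state (assumed)
    budget: float = 0.5   # maximum perturbation magnitude (assumed)
    horizon: int = 50
    state: float = 0.0
    t: int = 0

    def reset(self):
        self.state, self.t = 0.0, 0
        # The adversary's observation is only the queried nominal action.
        return nominal_policy(self.state)

    def step(self, delta):
        delta = clip(delta, -self.budget, self.budget)
        nominal_action = nominal_policy(self.state)         # query the black box
        applied = clip(nominal_action + delta, -1.0, 1.0)   # perturbed actuation
        self.state += 0.1 * applied
        self.t += 1
        reward = -abs(self.state - self.target)             # adversarial reward
        done = self.t >= self.horizon
        return nominal_policy(self.state), reward, done


def rollout(env, adversary):
    """Run one episode and return the adversary's total reward."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(adversary(obs))
        total += reward
    return total


no_attack = rollout(AttackEnv(), lambda obs: 0.0)   # adversary does nothing
max_push = rollout(AttackEnv(), lambda obs: 0.5)    # always push toward target
```

Here the two adversaries are hand-written constants. In the setting the abstract describes, the `adversary` callable would instead be a policy trained on this environment with a standard DRL algorithm; Proximal Policy Optimization, listed under Taxonomy, would be one candidate.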
Taxonomy
Methods: Entropy Regularization · Proximal Policy Optimization
