Certified Adversarial Robustness for Deep Reinforcement Learning
Björn Lütjens, Michael Everett, Jonathan P. How

TL;DR
This paper introduces an online certified defense for deep reinforcement learning. During execution, the defense computes guaranteed lower bounds on state-action values under perturbed inputs (noise or adversarial examples) and uses them to choose actions, with the aim of making learned policies safer in safety-critical applications.
Contribution
It develops a novel online certified robustness method for deep reinforcement learning that provides formal guarantees against adversarial perturbations during execution.
Findings
Increased robustness to noise and adversarial attacks in collision avoidance tasks.
Effective in classic control scenarios with adversarial perturbations.
Provides formal lower bounds on state-action values during decision-making.
Abstract
Deep Neural Network-based systems are now the state-of-the-art in many robotics tasks, but their application in safety-critical domains remains dangerous without formal guarantees on network robustness. Small perturbations to sensor inputs (from noise or adversarial examples) are often enough to change network-based decisions, which has already been shown to cause an autonomous vehicle to swerve into oncoming traffic. In light of these dangers, numerous algorithms have been developed as defenses against these adversarial inputs, some of which provide formal robustness guarantees or certificates. This work leverages research on certified adversarial robustness to develop an online certified defense for deep reinforcement learning algorithms. The proposed defense computes guaranteed lower bounds on state-action values during execution to identify and choose the optimal action under a…
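As a rough illustration of what "guaranteed lower bounds on state-action values" can look like in practice, the sketch below propagates an interval around the observed state through a tiny ReLU Q-network and picks the action whose lower bound is largest. This is only a sketch under stated assumptions: the abstract above is truncated and does not say which bounding procedure the paper uses, so interval bound propagation is swapped in here as one simple certified-bound technique. The network weights, the perturbation radius `eps` (an L-infinity ball), and all function names are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weights, biases), one weight row per output unit


def affine_bounds(W: Matrix, b: Vector, lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Bound y = W x + b for every x in the box [lo, hi] (elementwise)."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = h = bias
        for w, x_lo, x_hi in zip(row, lo, hi):
            # A positive weight maps the lower input to the lower output;
            # a negative weight flips the roles of the two endpoints.
            if w >= 0:
                l += w * x_lo
                h += w * x_hi
            else:
                l += w * x_hi
                h += w * x_lo
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def q_lower_bounds(layers: List[Layer], state: Vector, eps: float) -> Vector:
    """Lower bound on each Q(s', a) over all s' with |s' - state|_inf <= eps."""
    lo = [s - eps for s in state]
    hi = [s + eps for s in state]
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo = [max(0.0, v) for v in lo]
            hi = [max(0.0, v) for v in hi]
    return lo


def robust_action(layers: List[Layer], state: Vector, eps: float) -> int:
    """Index of the action with the largest certified lower bound."""
    q_lo = q_lower_bounds(layers, state, eps)
    return max(range(len(q_lo)), key=lambda a: q_lo[a])


# Hypothetical toy network: 1-D state, one hidden unit, two actions.
# Q0 = 5 * relu(x) depends strongly on the input; Q1 = 0.9 is constant.
toy_layers: List[Layer] = [
    ([[1.0]], [0.0]),
    ([[5.0], [0.0]], [0.0, 0.9]),
]

nominal = robust_action(toy_layers, [0.2], eps=0.0)    # no perturbation assumed
certified = robust_action(toy_layers, [0.2], eps=0.1)  # perturbation up to 0.1
print(nominal, certified)
```

In this toy case the nominal choice is action 0 (Q0 = 1.0 versus Q1 = 0.9), but once the state may be off by up to 0.1 the lower bound on Q0 drops to 0.5, so the bound-based rule switches to action 1. Interval bounds are sound but can be loose, so tighter relaxations would generally change which action is certified.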
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Physical Unclonable Functions (PUFs) and Hardware Security
