Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight

Yen-Chen Lin; Ming-Yu Liu; Min Sun; Jia-Bin Huang

arXiv:1710.00814·cs.CV·October 4, 2017·34 cites

TL;DR

This paper introduces a defense for deep reinforcement learning agents that uses action-conditioned frame prediction to detect and mitigate adversarial attacks. The work is motivated by safety-critical systems such as autonomous vehicles.

Contribution

The paper proposes a detection method that uses action-conditioned frame prediction to identify adversarial examples aimed at neural network policies.

Findings

01. Effective detection of adversarial attacks in Atari games

02. Improved reward retention under attack scenarios

03. Outperforms baseline detection algorithms

Abstract

Deep reinforcement learning has shown promising results in learning control policies for complex sequential decision-making tasks. However, these neural network-based policies are known to be vulnerable to adversarial examples. This vulnerability poses a potentially serious threat to safety-critical systems such as autonomous vehicles. In this paper, we propose a defense mechanism that protects reinforcement learning agents from adversarial attacks by leveraging an action-conditioned frame prediction module. Our core idea is that adversarial examples targeting a neural network-based policy are not effective against the frame prediction model. By comparing the action distribution produced by a policy from processing the current observed frame to the action distribution produced by the same policy from processing the predicted frame from the action-conditioned frame prediction module, we…
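
The comparison described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`l1_distance`, `flag_attack`, `choose_action`), the use of an L1 distance, the threshold value of 0.5, and the fallback to the predicted frame's action are all hypothetical choices made here for clarity. The two inputs stand for the policy's action probabilities on the observed frame and on the frame produced by the prediction module.

```python
from typing import Sequence, Tuple


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute differences between two action distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same action set")
    return sum(abs(a - b) for a, b in zip(p, q))


def flag_attack(
    observed_probs: Sequence[float],
    predicted_probs: Sequence[float],
    threshold: float = 0.5,  # assumed value; would need tuning per task
) -> Tuple[bool, float]:
    """Flag the observed frame when the two distributions disagree strongly.

    observed_probs:  policy output on the current (possibly attacked) frame.
    predicted_probs: policy output on the frame predicted from past frames
                     and actions.
    """
    distance = l1_distance(observed_probs, predicted_probs)
    return distance > threshold, distance


def choose_action(
    observed_probs: Sequence[float],
    predicted_probs: Sequence[float],
    threshold: float = 0.5,
) -> int:
    """Assumed mitigation: act on the predicted frame when an attack is flagged."""
    flagged, _ = flag_attack(observed_probs, predicted_probs, threshold)
    probs = predicted_probs if flagged else observed_probs
    # Greedy action: index of the largest probability.
    return max(range(len(probs)), key=lambda i: probs[i])


if __name__ == "__main__":
    clean = flag_attack([0.70, 0.20, 0.10], [0.65, 0.25, 0.10])
    attacked = flag_attack([0.05, 0.05, 0.90], [0.70, 0.20, 0.10])
    print("clean frame flagged:", clean[0])
    print("attacked frame flagged:", attacked[0])
```

The design relies on the abstract's core idea: a perturbation crafted against the policy should not carry over to the predicted frame, so a large gap between the two distributions is treated as evidence of tampering.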

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Physical Unclonable Functions (PUFs) and Hardware Security