Design of intentional backdoors in sequential models

Zhaoyuan Yang; Naresh Iyer; Johan Reimann; Nurali Virani

arXiv:1902.09972 · cs.CR · March 19, 2019 · 19 cites

Open Access

TL;DR

This paper introduces backdoor attacks on sequential decision-making models, such as LSTM-based reinforcement learning agents, demonstrates their effectiveness, and discusses potential defenses.

Contribution

The paper extends backdoor attack techniques to sequential decision-making models, specifically LSTM-based reinforcement learning agents, a setting that prior research had left underexplored.

Findings

01. Effective backdoor attacks demonstrated on grid-world environments

02. Activation of trojan triggers and malicious policies explained

03. Challenges with network size and unintentional triggers identified

Abstract

Recent work has demonstrated robust mechanisms by which attacks can be orchestrated on machine learning models. In contrast to adversarial examples, backdoor or trojan attacks embed surgically modified samples with targeted labels in the model training process to cause the targeted model to learn to misclassify chosen samples in the presence of specific triggers, while keeping the model performance stable across other nominal samples. However, current published research on trojan attacks mainly focuses on classification problems, which ignores sequential dependency between inputs. In this paper, we propose methods to discreetly introduce and exploit novel backdoor attacks within a sequential decision-making agent, such as a reinforcement learning agent, by training multiple benign and malicious policies within a single long short-term memory (LSTM) network. We demonstrate the…
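The data-poisoning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional setting, the action names, the single-step trigger, and the helpers `make_episode` and `make_dataset` are all assumptions made for illustration. The sketch builds training sequences in which a trigger is visible at one step only and every action from that step onward is relabeled with a malicious policy, so a recurrent model trained on the data would have to remember the trigger rather than react to it in isolation.

```python
import random

# Hypothetical policies for illustration only (not taken from the paper).
BENIGN_ACTION = "right"     # benign policy: always move toward the goal
MALICIOUS_ACTION = "left"   # malicious policy: always move away from it


def make_episode(length, trigger_step=None):
    """Return (observations, actions) for one episode.

    Each observation is (step_index, trigger_visible). The trigger is
    visible at a single step; actions from that step onward carry the
    malicious label, which creates a dependency on past inputs.
    """
    observations, actions = [], []
    triggered = False
    for t in range(length):
        trigger_visible = (t == trigger_step)
        triggered = triggered or trigger_visible
        observations.append((t, trigger_visible))
        actions.append(MALICIOUS_ACTION if triggered else BENIGN_ACTION)
    return observations, actions


def make_dataset(n_episodes, length, poison_rate, seed=0):
    """Mix clean and poisoned episodes; poison_rate is the poisoned fraction."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_episodes):
        if rng.random() < poison_rate:
            trigger_step = rng.randrange(length)  # random trigger position
        else:
            trigger_step = None                   # clean episode
        dataset.append(make_episode(length, trigger_step))
    return dataset
```

For example, `make_episode(5, trigger_step=2)` labels steps 0 and 1 with the benign action and steps 2 through 4 with the malicious one, even though the trigger flag is set only at step 2. A memoryless classifier could not fit the labels at steps 3 and 4, which is why the sequential setting differs from the classification case the abstract contrasts it with.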

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)