Mitigating backdoor attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification

Chuanshuai Chen; Jiazhu Dai

arXiv:2007.12070·cs.CR·March 16, 2021·6 cites

Open Access

TL;DR

This paper introduces Backdoor Keyword Identification (BKI), a defense that detects poisoned samples in the training data of LSTM-based text classifiers and excludes them, mitigating backdoor attacks.

Contribution

The paper presents a new defense technique specifically for RNN backdoor attacks in text classification, analyzing LSTM neuron changes to identify malicious training data.

Findings

01. Effective detection of poisoned samples across four datasets

02. High accuracy in mitigating backdoor triggers

03. Applicable without a trusted dataset

Abstract

Deep neural networks have been shown to face a new threat called backdoor attacks, in which the adversary injects a backdoor into the model by poisoning the training dataset. When the input contains a special pattern called the backdoor trigger, the backdoored model carries out a malicious task specified by the adversary, such as misclassification. In text classification systems, backdoors inserted into the models can cause spam or malicious speech to escape detection. Previous work has mainly focused on defending against backdoor attacks in computer vision; little attention has been paid to defense methods for RNN backdoor attacks on text classification. In this paper, by analyzing the changes in inner LSTM neurons, we propose a defense method called Backdoor Keyword Identification (BKI) to mitigate backdoor attacks which the adversary performs…
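
As a rough illustration of what "analyzing the changes in inner LSTM neurons" could look like, the sketch below scores each word of a sentence by how much the final LSTM hidden state moves when that word is deleted. This is an assumption about the general idea, not the paper's exact scoring function. The names `TinyLSTM` and `word_impact_scores`, the random untrained weights, and the token `xqz` are all hypothetical and exist only for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM with random, untrained weights (illustration only)."""

    def __init__(self, vocab_size, embed_dim=4, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.embed = [[rng.gauss(0, 0.5) for _ in range(embed_dim)]
                      for _ in range(vocab_size)]
        # Four gate blocks (input, forget, output, candidate) stacked row-wise.
        self.W = [[rng.gauss(0, 0.5) for _ in range(embed_dim + hidden_dim)]
                  for _ in range(4 * hidden_dim)]

    def final_hidden(self, token_ids):
        """Run the sequence and return the hidden state after the last token."""
        H = self.hidden_dim
        h = [0.0] * H
        c = [0.0] * H
        for t in token_ids:
            x = self.embed[t] + h  # new list: [embedding ; previous hidden state]
            z = [sum(w * v for w, v in zip(row, x)) for row in self.W]
            for j in range(H):
                i_gate = sigmoid(z[j])
                f_gate = sigmoid(z[H + j])
                o_gate = sigmoid(z[2 * H + j])
                cand = math.tanh(z[3 * H + j])
                c[j] = f_gate * c[j] + i_gate * cand
                h[j] = o_gate * math.tanh(c[j])
        return h


def word_impact_scores(model, token_ids):
    """Score each position by the largest absolute change in the final
    hidden state when the word at that position is removed."""
    base = model.final_hidden(token_ids)
    scores = []
    for k in range(len(token_ids)):
        reduced = token_ids[:k] + token_ids[k + 1:]
        changed = model.final_hidden(reduced)
        scores.append(max(abs(a - b) for a, b in zip(base, changed)))
    return scores


# "xqz" is a made-up stand-in for a trigger token (assumption, not from the paper).
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3, "xqz": 4}
sentence = ["the", "movie", "xqz", "was", "great"]
ids = [vocab[w] for w in sentence]
model = TinyLSTM(vocab_size=len(vocab))

for word, score in zip(sentence, word_impact_scores(model, ids)):
    print(f"{word:>6}  {score:.4f}")
```

Because the weights here are random, the printed scores carry no meaning; the example only shows the mechanics. With a trained, backdoored model in place of `TinyLSTM`, words whose removal shifts the hidden state unusually strongly across many training samples would be plausible candidates for a backdoor keyword.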

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Malware Detection Techniques

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory