Natural Language Adversarial Defense through Synonym Encoding

Xiaosen Wang; Hao Jin; Yichen Yang; Kun He

arXiv:1909.06723·cs.CL·June 16, 2021·33 cites

TL;DR

This paper introduces SEM, a novel defense method against synonym substitution attacks in NLP, by encoding synonyms to improve model robustness without altering architecture or requiring extra data.

Contribution

The paper proposes SEM, a new synonym encoding defense technique that effectively counters synonym-based adversarial attacks in NLP models.

Findings

01

SEM defends against current synonym substitution attacks

02

SEM blocks transferability of adversarial examples

03

SEM scales efficiently to large models and datasets

Abstract

In natural language processing, deep learning models have recently been shown to be vulnerable to various types of adversarial perturbations, but relatively little work has been done on the defense side. In particular, there exist few effective defense methods against the successful synonym-substitution-based attacks, which preserve the syntactic structure and semantic information of the original text while fooling the deep learning models. We contribute in this direction and propose a novel adversarial defense method called Synonym Encoding Method (SEM). Specifically, SEM inserts an encoder before the input layer of the target model to map each cluster of synonyms to a unique encoding and trains the model to eliminate possible adversarial perturbations without modifying the network architecture or adding extra data. Extensive experiments demonstrate that SEM can effectively defend the…
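The encoding idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synonym clusters are hand-written, the names `SYNONYM_CLUSTERS`, `build_encoder`, and `encode` are hypothetical, and picking the first word of each cluster as its code is an assumption made only for illustration. The sketch shows why a synonym substitution has no effect once every word in a cluster is mapped to the same code before reaching the model.

```python
from typing import Dict, Iterable, List

# Hypothetical, hand-written synonym clusters (an assumption for this sketch).
SYNONYM_CLUSTERS: List[List[str]] = [
    ["good", "great", "fine"],
    ["movie", "film", "picture"],
    ["bad", "awful", "terrible"],
]


def build_encoder(clusters: Iterable[Iterable[str]]) -> Dict[str, str]:
    """Map every word in a cluster to that cluster's first word."""
    encoder: Dict[str, str] = {}
    for cluster in clusters:
        words = list(cluster)
        if not words:
            continue
        canonical = words[0]
        for word in words:
            # If a word appears in two clusters, keep its first assignment.
            encoder.setdefault(word, canonical)
    return encoder


def encode(tokens: Iterable[str], encoder: Dict[str, str]) -> List[str]:
    """Replace each token by its cluster code; unknown tokens pass through."""
    return [encoder.get(tok, tok) for tok in tokens]


ENCODER = build_encoder(SYNONYM_CLUSTERS)

original = "a great film".split()
substituted = "a fine movie".split()  # synonym-substituted variant

# Both inputs collapse to the same encoded sequence, so a model that only
# ever sees encoded text cannot be pushed apart by this substitution.
print(encode(original, ENCODER))
print(encode(substituted, ENCODER))
```

In this toy setting the encoder sits in front of the model as a preprocessing step, so the model itself is unchanged, which matches the abstract's point about not modifying the network architecture. How clusters are actually built is outside what the text above specifies.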

Code & Models

Repositories

xiaosen-wang/SEM
tf · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Hate Speech and Cyberbullying Detection · Advanced Malware Detection Techniques