Enhancing Model Robustness By Incorporating Adversarial Knowledge Into Semantic Representation

Jinfeng Li; Tianyu Du; Xiangyu Liu; Rong Zhang; Hui Xue; Shouling Ji

arXiv:2102.11584·cs.CL·February 24, 2021

Open Access

TL;DR

AdvGraph is a lightweight, task-agnostic defense method that improves the robustness of Chinese NLP models against adversarial attacks by incorporating adversarial knowledge into semantic representations, without sacrificing performance.

Contribution

The paper introduces AdvGraph, a novel approach that enhances Chinese NLP model robustness by embedding adversarial knowledge into semantic representations, applicable across tasks without retraining.

Findings

01 Significantly improves robustness under adaptive attacks

02 Maintains performance on legitimate inputs

03 Lightweight with sub-linear computational complexity

Abstract

Although deep neural networks (DNNs) have achieved enormous success in many domains such as natural language processing (NLP), they have also been shown to be vulnerable to maliciously generated adversarial examples. This inherent vulnerability threatens various DNN-based applications deployed in the real world. To strengthen model robustness, several countermeasures have been proposed in the English NLP domain and have achieved satisfactory performance. However, due to the unique language properties of Chinese, it is not trivial to extend existing defenses to the Chinese domain. Therefore, we propose AdvGraph, a novel defense which enhances the robustness of Chinese-based NLP models by incorporating adversarial knowledge into the semantic representation of the input. Extensive experiments on two real-world tasks show that AdvGraph exhibits better performance compared with previous work:…
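The abstract does not spell out how adversarial knowledge enters the semantic representation, so the following is only a toy illustration of the general idea, not AdvGraph itself. It assumes a hypothetical graph whose edges link a character to variants an attacker might substitute for it, and it blends each character's embedding with the mean of its neighbors' embeddings. The names `EMB`, `ADV_GRAPH`, `fuse`, and the mixing weight `alpha` are all made up for this sketch.

```python
import math

# Hypothetical toy embeddings (assumption: 2-dimensional, hand-picked values).
EMB = {
    "微": [1.0, 0.0],
    "薇": [0.0, 1.0],  # a visually/phonetically similar variant of "微"
    "信": [0.5, 0.5],
}

# Hypothetical adversarial graph: an edge means "attackers may swap these".
ADV_GRAPH = {
    "微": ["薇"],
    "薇": ["微"],
    "信": [],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def fuse(char, alpha=0.5):
    """Blend a character's embedding with the mean of its graph neighbors.

    alpha weights the character's own embedding; (1 - alpha) weights the
    neighbor mean. Characters without neighbors are returned unchanged.
    """
    own = EMB[char]
    neighbors = ADV_GRAPH.get(char, [])
    if not neighbors:
        return list(own)
    mean = [
        sum(EMB[n][i] for n in neighbors) / len(neighbors)
        for i in range(len(own))
    ]
    return [alpha * o + (1 - alpha) * m for o, m in zip(own, mean)]


if __name__ == "__main__":
    raw = cosine(EMB["微"], EMB["薇"])
    fused = cosine(fuse("微"), fuse("薇"))
    print(f"similarity before fusion: {raw:.2f}")
    print(f"similarity after fusion:  {fused:.2f}")
```

In this toy setting, a character and its substituted variant start out with unrelated embeddings and end up with closely aligned ones after fusion, which is the intuition behind making a downstream model less sensitive to such substitutions. How the real method builds its graph and learns its representations is not stated in the text above.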

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling