Enhancing Model Robustness By Incorporating Adversarial Knowledge Into Semantic Representation
Jinfeng Li, Tianyu Du, Xiangyu Liu, Rong Zhang, Hui Xue, Shouling Ji

TL;DR
AdvGraph is a lightweight, task-agnostic defense method that improves the robustness of Chinese NLP models against adversarial attacks by incorporating adversarial knowledge into semantic representations, without sacrificing performance.
Contribution
The paper introduces AdvGraph, a novel approach that enhances Chinese NLP model robustness by embedding adversarial knowledge into semantic representations, applicable across tasks without retraining.
Findings
Significantly improves robustness under adaptive attacks
Maintains performance on legitimate inputs
Lightweight with sub-linear computational complexity
Abstract
Although deep neural networks (DNNs) have achieved enormous success in many domains such as natural language processing (NLP), they have also been shown to be vulnerable to maliciously generated adversarial examples. This inherent vulnerability threatens various real-world DNN-based applications. To strengthen model robustness, several countermeasures have been proposed in the English NLP domain and have achieved satisfactory performance. However, due to the unique language properties of Chinese, it is not trivial to extend existing defenses to the Chinese domain. Therefore, we propose AdvGraph, a novel defense that enhances the robustness of Chinese NLP models by incorporating adversarial knowledge into the semantic representation of the input. Extensive experiments on two real-world tasks show that AdvGraph exhibits better performance compared with previous work:…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling
