Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder

Yao Qiu; Jinchao Zhang; Jie Zhou

arXiv:2109.06536·cs.CL·September 15, 2021

TL;DR

This paper introduces two novel adversarial training methods, CARL and RAR, that enhance the robustness of text classification models against gradient-based adversarial attacks by improving representation learning and reconstruction.

Contribution

The paper proposes two new adversarial training approaches, CARL and RAR, that improve model robustness and efficiency in defending against gradient-based adversarial attacks in text classification.

Findings

01

Both approaches outperform strong baselines on various datasets.

02

Semantic representations are less affected by adversarial perturbations.

03

RAR can generate text-form adversarial samples.

Abstract

Recent work has proposed several efficient approaches for generating gradient-based adversarial perturbations on embeddings and has shown that a model's performance and robustness can be improved when it is trained with these contaminated embeddings. However, little attention has been paid to helping the model learn from these adversarial samples more efficiently. In this work, we focus on enhancing the model's ability to defend against gradient-based adversarial attacks during training and propose two novel adversarial training approaches: (1) CARL narrows the distance between the original sample and its adversarial sample in the representation space while enlarging their distance from samples with different labels. (2) RAR forces the model to reconstruct the original sample from its adversarial representation. Experiments show that the two proposed approaches outperform strong baselines on various…
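
The CARL idea described in the abstract (pull a sample toward its adversarial counterpart, push it away from samples with other labels) can be illustrated with a small contrastive loss. The following is a minimal sketch under stated assumptions, not the authors' implementation: the InfoNCE-style form, the cosine similarity, the `temperature` value, and the function names `l2_normalize` and `contrastive_adv_loss` are all hypothetical choices made for illustration, and the adversarial representations are taken as given rather than computed from gradients.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def contrastive_adv_loss(clean, adv, labels, temperature=0.1):
    """Illustrative contrastive loss (an assumption, not the paper's exact objective).

    clean:  (B, D) representations of the original samples
    adv:    (B, D) representations of the corresponding adversarial samples
    labels: (B,)   integer class labels

    Positive pair: each clean representation and its own adversarial one.
    Negatives:     clean representations in the batch with a different label.
    """
    z = l2_normalize(np.asarray(clean, dtype=float))
    z_adv = l2_normalize(np.asarray(adv, dtype=float))
    labels = np.asarray(labels)

    # Similarity of each sample to its adversarial counterpart, shape (B,).
    pos = np.sum(z * z_adv, axis=1) / temperature

    # Pairwise similarities between clean representations, shape (B, B).
    sim = z @ z.T / temperature

    # Only samples with a different label act as negatives.
    neg_mask = labels[:, None] != labels[None, :]
    neg_sum = np.where(neg_mask, np.exp(sim), 0.0).sum(axis=1)

    # -log( exp(pos) / (exp(pos) + sum of exp over negatives) ), averaged over the batch.
    loss = -pos + np.log(np.exp(pos) + neg_sum)
    return float(loss.mean())


# Toy batch: two classes, two samples each.
clean = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]])
labels = np.array([0, 1, 0, 1])

adv_near = clean + 0.01   # adversarial representation stays close to the original
adv_far = -clean          # adversarial representation pushed far away

loss_near = contrastive_adv_loss(clean, adv_near, labels)
loss_far = contrastive_adv_loss(clean, adv_far, labels)
print(loss_near, loss_far)
```

In this toy batch the loss is smaller when the adversarial representations stay close to the originals, which is the behaviour the abstract attributes to CARL. In actual training such a term would presumably be combined with the classification loss; how the paper weights or schedules it is not stated in the text above.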

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling