Disentangled Text Representation Learning with Information-Theoretic Perspective for Adversarial Robustness

Jiahao Zhao; Wenji Mao

arXiv:2210.14957 · cs.CL · October 28, 2022

Open Access

TL;DR

This paper proposes a novel disentangled representation learning approach based on information theory to improve adversarial robustness in NLP models by explicitly separating robust and non-robust features.

Contribution

It introduces a mutual information-based disentangled learning framework that explicitly separates robust and non-robust features for enhanced adversarial robustness in NLP.

Findings

1. Significantly outperforms existing methods under adversarial attacks.
2. Effectively disentangles robust and non-robust features.
3. Improves model reliability in NLP tasks.

Abstract

Adversarial vulnerability remains a major obstacle to constructing reliable NLP systems. When imperceptible perturbations are added to raw input text, the performance of a deep learning model may drop dramatically under attacks. Recent work argues the adversarial vulnerability of the model is caused by the non-robust features in supervised training. Thus in this paper, we tackle the adversarial robustness challenge from the view of disentangled representation learning, which is able to explicitly disentangle robust and non-robust features in text. Specifically, inspired by the variation of information (VI) in information theory, we derive a disentangled learning objective composed of mutual information to represent both the semantic representativeness of latent embeddings and differentiation of robust and non-robust features. On the basis of this, we design a disentangled learning…
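The abstract names the variation of information (VI) as the starting point for its objective. As a reminder of what that quantity measures, the standard identity is VI(X;Y) = H(X) + H(Y) - 2 I(X;Y), so VI is small when two variables share most of their information and large when they share little. The sketch below computes both quantities for a small discrete joint distribution. It is only an illustration of the textbook definition, not the paper's learning objective (which the truncated abstract does not spell out), and the function names and toy distributions are assumptions made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution; zero terms are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def variation_of_information(joint):
    """Return (I(X;Y), VI(X;Y)) for a joint probability table joint[x][y].

    Uses the textbook identities
        I(X;Y)  = H(X) + H(Y) - H(X,Y)
        VI(X;Y) = H(X) + H(Y) - 2 * I(X;Y)
    """
    px = [sum(row) for row in joint]            # marginal of X
    py = [sum(col) for col in zip(*joint)]      # marginal of Y
    hx, hy = entropy(px), entropy(py)
    hxy = entropy(p for row in joint for p in row)
    mi = hx + hy - hxy
    vi = hx + hy - 2 * mi
    return mi, vi


# Toy distributions (hypothetical): two independent fair bits,
# and two bits that are always equal.
independent = [[0.25, 0.25], [0.25, 0.25]]
identical = [[0.5, 0.0], [0.0, 0.5]]

mi_ind, vi_ind = variation_of_information(independent)
mi_same, vi_same = variation_of_information(identical)

print(f"independent: I = {mi_ind:.4f}, VI = {vi_ind:.4f}")
print(f"identical:   I = {mi_same:.4f}, VI = {vi_same:.4f}")
```

For the independent pair the mutual information is zero and VI equals the sum of the two entropies; for the identical pair the mutual information equals the entropy of either variable and VI is zero. This is the intuition behind using mutual-information terms to push two sets of features apart or tie a representation to its input.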

Taxonomy

Topics: Adversarial Robustness in Machine Learning