Improving Robustness and Generality of NLP Models Using Disentangled Representations

Jiawei Wu; Xiaoya Li; Xiang Ao; Yuxian Meng; Fei Wu; Jiwei Li

arXiv:2009.09587 · cs.CL · September 22, 2020 · 5 cites

TL;DR

This paper introduces a disentangled representation learning approach for NLP models, improving their robustness and domain generalization by mapping inputs to multiple independent representations and combining their predictions.

Contribution

The paper proposes novel methods to enhance NLP model robustness and generality through disentangled representations, including regularization techniques like L2 and Total Correlation within the variational information bottleneck framework.

Findings

01. Models with disentangled representations outperform baseline models in robustness.

02. Disentangled models show improved domain adaptation capabilities.

03. Proposed methods enhance stability against input perturbations.

Abstract

Supervised neural networks, which first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$, have achieved remarkable success in a wide range of natural language processing (NLP) tasks. Despite their success, neural models lack both robustness and generality: small perturbations to inputs can result in entirely different outputs, and the performance of a model trained on one domain drops drastically when tested on another domain. In this paper, we present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning. Instead of mapping $x$ to a single representation $z$, the proposed strategy maps $x$ to a set of representations $\{z_{1}, z_{2}, \ldots, z_{K}\}$ while forcing them to be disentangled. These representations are then mapped to different logits $l$, the ensemble of which is used to make…
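
The pipeline the abstract describes (one input, $K$ representations, one set of logits per representation, then an ensemble) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `DisentangledEnsemble`, the linear-plus-tanh encoders, the averaging of logits, and the function `pairwise_overlap_penalty` are all hypothetical names and choices made here for illustration. In particular, the squared-cosine penalty is only a stand-in for a disentanglement term; it is not the L2 or Total Correlation regularizer the paper proposes.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows (illustrative init)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


class DisentangledEnsemble:
    """Hypothetical model: K encoders, K classifiers, averaged logits."""

    def __init__(self, input_dim, rep_dim, num_classes, k, seed=0):
        rng = random.Random(seed)
        # One encoder per representation z_1..z_K (assumed linear + tanh).
        self.encoders = [make_matrix(rep_dim, input_dim, rng) for _ in range(k)]
        # One classifier head per representation, each producing its own logits.
        self.classifiers = [make_matrix(num_classes, rep_dim, rng) for _ in range(k)]

    def represent(self, x):
        """Map input x to the set of representations {z_1, ..., z_K}."""
        return [[math.tanh(h) for h in matvec(enc, x)] for enc in self.encoders]

    def predict_proba(self, x):
        """Ensemble the K sets of logits by averaging (an assumption), then softmax."""
        zs = self.represent(x)
        all_logits = [matvec(clf, z) for clf, z in zip(self.classifiers, zs)]
        k = len(all_logits)
        mean_logits = [sum(col) / k for col in zip(*all_logits)]
        return softmax(mean_logits)


def pairwise_overlap_penalty(zs):
    """Mean squared cosine similarity over all pairs of representations.

    Stand-in disentanglement term (NOT the paper's regularizer): it is 0 when
    the representations are mutually orthogonal and 1 when they are collinear.
    """
    total, pairs = 0.0, 0
    for i in range(len(zs)):
        for j in range(i + 1, len(zs)):
            dot = sum(a * b for a, b in zip(zs[i], zs[j]))
            norm_i = math.sqrt(sum(a * a for a in zs[i]))
            norm_j = math.sqrt(sum(b * b for b in zs[j]))
            cos = dot / (norm_i * norm_j + 1e-12)
            total += cos * cos
            pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    model = DisentangledEnsemble(input_dim=4, rep_dim=3, num_classes=2, k=3)
    x = [0.5, -1.0, 0.25, 2.0]
    print("class probabilities:", model.predict_proba(x))
    print("overlap penalty:", pairwise_overlap_penalty(model.represent(x)))
```

In a training loop, a term like `pairwise_overlap_penalty` would typically be added to the classification loss with a weighting coefficient, so that the $K$ representations are pushed apart while each head still learns to predict $y$. The weights here are random and untrained, so the printed probabilities carry no meaning beyond showing the data flow.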

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning