Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER

Phillip Keung; Yichao Lu; Vikas Bhardwaj

arXiv:1909.00153 · cs.CL · March 23, 2020 · 5 cites

TL;DR

This paper applies adversarial learning to multilingual BERT to improve zero-resource cross-lingual classification and NER, and shows that the training encourages BERT to align the embeddings of English documents with those of their translations.

Contribution

The paper adds language-adversarial training to multilingual BERT finetuned only on labeled English data, improving zero-resource cross-lingual classification and NER performance and promoting embedding alignment across languages.

Findings

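01. Adversarial training improves multilingual BERT's zero-resource performance on MLDoc text classification and CoNLL 2002/2003 NER.

02. Language-adversarial training encourages BERT to align the embeddings of English documents and their translations.

03. The embedding alignment may be the cause of the observed gains in cross-lingual transfer.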

Abstract

Contextual word embeddings (e.g. GPT, BERT, ELMo) have demonstrated state-of-the-art performance on various NLP tasks. Recent work with the multilingual version of BERT has shown that the model performs very well in zero-shot and zero-resource cross-lingual settings, where only labeled English data is used to finetune the model. We improve upon multilingual BERT's zero-resource cross-lingual performance via adversarial learning. We report the magnitude of the improvement on the multilingual MLDoc text classification and CoNLL 2002/2003 named entity recognition tasks. Furthermore, we show that language-adversarial training encourages BERT to align the embeddings of English documents and their translations, which may be the cause of the observed performance gains.
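To make the idea of language-adversarial training concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state the exact objective, so the sketch assumes one common formulation in which a language discriminator is trained with the true language labels while the encoder is trained against flipped labels. The names `discriminator_prob` and `adversarial_losses`, the toy 2-D embeddings, and the fixed discriminator weights are all hypothetical and chosen for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    eps = 1e-12  # guards against log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def discriminator_prob(embedding, weights, bias):
    """Probability (under a linear discriminator) that an embedding is English (label 1)."""
    z = sum(w * h for w, h in zip(weights, embedding)) + bias
    return sigmoid(z)

def adversarial_losses(embeddings, is_english, weights, bias):
    """Return (discriminator_loss, encoder_adversarial_loss), averaged over the batch.

    Assumed formulation: the discriminator minimizes cross-entropy on the true
    language labels; the encoder minimizes cross-entropy on the flipped labels,
    so it is rewarded when the discriminator cannot tell the languages apart.
    """
    probs = [discriminator_prob(h, weights, bias) for h in embeddings]
    n = len(probs)
    d_loss = sum(bce(p, y) for p, y in zip(probs, is_english)) / n
    g_loss = sum(bce(p, 1 - y) for p, y in zip(probs, is_english)) / n
    return d_loss, g_loss

# Toy batch: two "English" and two "non-English" document embeddings (hypothetical values).
labels = [1, 1, 0, 0]
weights, bias = [3.0, 0.0], 0.0  # hypothetical discriminator that reads the first dimension

# Before alignment: the first dimension leaks the language identity.
separated = [[2.0, 0.3], [2.0, -0.1], [-2.0, 0.3], [-2.0, -0.1]]
d_sep, g_sep = adversarial_losses(separated, labels, weights, bias)

# After alignment: the language signal is gone, so the discriminator outputs 0.5.
aligned = [[0.0, 0.3], [0.0, -0.1], [0.0, 0.3], [0.0, -0.1]]
d_al, g_al = adversarial_losses(aligned, labels, weights, bias)

print(f"separated: d_loss={d_sep:.4f}, g_loss={g_sep:.4f}")
print(f"aligned:   d_loss={d_al:.4f}, g_loss={g_al:.4f}")
```

In this toy setup the encoder's adversarial loss is large when the embeddings are linearly separable by language and drops to log 2 once they are aligned, which is the pressure toward language-invariant embeddings that the abstract describes. In a full training loop this term would typically be added to the English task loss with a weighting coefficient (an assumption here, not a detail taken from the paper), and the discriminator and encoder would be updated in alternation.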

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Linear Layer · Cosine Annealing · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM · Discriminative Fine-Tuning · Linear Warmup With Cosine Annealing · Byte Pair Encoding · GPT