A Comparison of Architectures and Pretraining Methods for Contextualized Multilingual Word Embeddings

Niels van der Heijden; Samira Abnar; Ekaterina Shutova

arXiv:1912.10169·cs.CL·December 24, 2019

TL;DR

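This paper compares state-of-the-art multilingual word and sentence encoders on named entity recognition and part-of-speech tagging, and introduces a new method for creating multilingual contextualized word embeddings that improves zero-shot transfer and cross-language knowledge sharing.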

Contribution

The paper provides a comprehensive comparison of multilingual word and sentence encoders and proposes a new embedding method that improves zero-shot transfer and joint training across languages.

Findings

01

The proposed method performs at or above the state of the art in zero-shot transfer settings.

02

It enables better knowledge sharing across languages in a joint training setting.

03

The comparison highlights strengths and weaknesses of existing models.

Abstract

The lack of annotated data in many languages is a well-known challenge within the field of multilingual natural language processing (NLP). Therefore, many recent studies focus on zero-shot transfer learning and joint training across languages to overcome data scarcity for low-resource languages. In this work we (i) perform a comprehensive comparison of state-of-the-art multilingual word and sentence encoders on the tasks of named entity recognition (NER) and part-of-speech (POS) tagging; and (ii) propose a new method for creating multilingual contextualized word embeddings, compare it to multiple baselines and show that it performs at or above state-of-the-art level in zero-shot transfer settings. Finally, we show that our method allows for better knowledge sharing across languages in a joint training setting.
