Cross-lingual alignments of ELMo contextual embeddings

Matej Ulčar; Marko Robnik-Šikonja

arXiv:2106.15986·cs.CL·June 1, 2022

TL;DR

This paper introduces novel methods for aligning ELMo contextual embeddings across languages, enabling cross-lingual transfer for NLP tasks, with promising results on multiple languages and tasks.

Contribution

It proposes new linear and nonlinear (GAN-based) methods for cross-lingual alignment of ELMo embeddings, addressing the challenge of creating anchor points in context.

Findings

01 ELMoGAN performs well on NER and terminology tasks.

02 Linear methods are more effective for dependency parsing and sentiment analysis.

03 Cross-lingual alignment reduces the need for extensive training data in low-resource languages.

Abstract

Building machine learning prediction models for a specific NLP task requires sufficient training data, which can be difficult to obtain for less-resourced languages. Cross-lingual embeddings map word embeddings from a less-resourced language to a resource-rich language so that a prediction model trained on data from the resource-rich language can also be used in the less-resourced language. To produce cross-lingual mappings of recent contextual embeddings, anchor points between the embedding spaces have to be words in the same context. We address this issue with a novel method for creating cross-lingual contextual alignment datasets. Based on that, we propose several cross-lingual mapping methods for ELMo embeddings. The proposed linear mapping methods use existing Vecmap and MUSE alignments on contextual ELMo embeddings. Novel nonlinear ELMoGAN mapping methods are based on GANs and do…
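
The linear mapping idea in the abstract can be illustrated with a small sketch. It assumes orthogonal Procrustes, a standard technique for learning a linear map between embedding spaces from anchor pairs; it is not the paper's exact pipeline and does not call the Vecmap or MUSE tools themselves. The function name fit_orthogonal_map, the synthetic data, and the dimensions are all hypothetical and chosen for illustration only. In practice, each row pair would hold the contextual vectors of a word and its translation taken from matching contexts.

```python
import numpy as np


def fit_orthogonal_map(src, tgt):
    """Return the orthogonal matrix W that minimizes ||src @ W - tgt||_F.

    src, tgt: arrays of shape (n_pairs, dim); row i of src and row i of tgt
    are the embeddings of one anchor pair (hypothetical input format).
    """
    # Closed-form Procrustes solution: SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Synthetic stand-in for anchor pairs: the "target" space is an exact
# rotation of the "source" space, so the true map is known.
rng = np.random.default_rng(0)
dim, n_pairs = 8, 200
src = rng.normal(size=(n_pairs, dim))
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
tgt = src @ true_rotation

w = fit_orthogonal_map(src, tgt)
mapped = src @ w  # source embeddings expressed in the target space
residual = np.linalg.norm(mapped - tgt)
print(f"alignment residual: {residual:.2e}")
```

Once such a map is fitted, embeddings from the less-resourced language can be projected with src @ w and passed to a model trained on the resource-rich language. Real anchor pairs are noisy, so the residual would not be near zero as it is on this synthetic data.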

Taxonomy

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Bidirectional LSTM · Softmax · ELMo