# Multilingual Factor Analysis

**Authors:** Francisco Vargas, Kamen Brestnichki, Alex Papadopoulos-Korfiatis, and Nils Hammerla

arXiv: 1905.05547 · 2019-10-25

## TL;DR

This paper fits a generative latent variable model to a multilingual dictionary to learn multilingual word representations offline. The resulting alignment is competitive across a variety of tasks and robust to noise in the embedding space.

## Contribution

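An offline approach to aligning multilingual embeddings: a generative latent variable model, fitted to a multilingual dictionary, treats equivalent words in different languages as different views of one shared latent variable representing their lexical meaning.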

## Key findings

- Achieves competitive results across a variety of tasks when the fitted model is queried for multilingual embeddings
- Robust to noise in the embedding space
- Suitable for distributed representations learned from noisy corpora

## Abstract

In this work we approach the task of learning multilingual word representations in an offline manner by fitting a generative latent variable model to a multilingual dictionary. We model equivalent words in different languages as different views of the same word generated by a common latent variable representing their latent lexical meaning. We explore the task of alignment by querying the fitted model for multilingual embeddings, achieving competitive results across a variety of tasks. The proposed model is robust to noise in the embedding space, making it a suitable method for distributed representations learned from noisy corpora.
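The "different views of a common latent variable" idea can be illustrated with a small NumPy sketch. This summary does not give the model equations, so the sketch assumes a plain linear-Gaussian factor model with isotropic noise, `x_j = W_j z + mu_j + eps_j` with `z ~ N(0, I)`, which is a guess at the general shape rather than the authors' exact formulation. The loadings are random placeholders, not parameters fitted to a dictionary, and the names `sample_views`, `posterior_mean`, `"en"` and `"fr"` are hypothetical.

```python
import numpy as np


def sample_views(W_list, mu_list, noise_std, n, rng):
    """Draw n latent vectors z ~ N(0, I) and one noisy view of each per language."""
    k = W_list[0].shape[1]
    z = rng.standard_normal((n, k))
    views = [
        z @ W.T + mu + noise_std * rng.standard_normal((n, W.shape[0]))
        for W, mu in zip(W_list, mu_list)
    ]
    return z, views


def posterior_mean(x, W, mu, noise_std):
    """E[z | x] under x = W z + mu + eps, eps ~ N(0, noise_std^2 I), z ~ N(0, I)."""
    k = W.shape[1]
    # Posterior precision of z: I + W^T W / sigma^2
    precision = np.eye(k) + (W.T @ W) / noise_std**2
    rhs = W.T @ (x - mu).T / noise_std**2
    return np.linalg.solve(precision, rhs).T


rng = np.random.default_rng(0)
latent_dim, noise_std, n_words = 3, 0.01, 200

# Assumed, illustrative setup: two "languages" with different embedding sizes.
# Loadings have orthonormal columns (via QR) so the toy problem is well conditioned.
dims = {"en": 5, "fr": 6}
W = {lang: np.linalg.qr(rng.standard_normal((d, latent_dim)))[0] for lang, d in dims.items()}
mu = {lang: rng.standard_normal(d) for lang, d in dims.items()}

z, (x_en, x_fr) = sample_views(
    [W["en"], W["fr"]], [mu["en"], mu["fr"]], noise_std, n_words, rng
)

# Query the model: map each language's embeddings into the shared latent space.
z_en = posterior_mean(x_en, W["en"], mu["en"], noise_std)
z_fr = posterior_mean(x_fr, W["fr"], mu["fr"], noise_std)

# Translation pairs should land close together in the shared space.
max_gap = np.abs(z_en - z_fr).max()
```

In this toy setting, alignment amounts to comparing `z_en` and `z_fr`: embeddings of dictionary-equivalent words are generated from the same `z`, so their inferred latent representations nearly coincide when the noise is small. Fitting the loadings and noise from an actual dictionary (for example with EM) is left out of the sketch.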

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.05547/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1905.05547/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/1905.05547/full.md

---
Source: https://tomesphere.com/paper/1905.05547