Machine Translation with Cross-lingual Word Embeddings
Marco Berlot, Evan Kaplan

TL;DR
This paper explores the use of cross-lingual word embeddings to create a shared semantic space for multiple languages, enabling better transfer learning and machine translation when data for some languages is limited.
Contribution
It introduces a method for learning a unified word embedding space across languages, facilitating cross-lingual tasks and improving translation quality.
Findings
Cross-lingual embeddings align semantically similar words across languages
Shared representations enable transfer learning in low-resource languages
Potential improvements in machine translation accuracy
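The second finding, transfer to a low-resource language, can be illustrated with a minimal sketch. It assumes a shared embedding space already exists. The 2-D vectors, the word lists, the sentiment labels, and the helper names `centroid` and `predict` are all hypothetical and are not taken from the paper. A nearest-centroid classifier stands in for whatever classifier one would actually use.

```python
import numpy as np

# Hypothetical 2-D vectors in an already-shared English/Italian space.
shared = {
    ("en", "good"): [0.9, 0.1], ("en", "great"): [0.8, 0.2],
    ("en", "bad"): [0.1, 0.9], ("en", "awful"): [0.2, 0.8],
    ("it", "buono"): [0.85, 0.15], ("it", "pessimo"): [0.15, 0.85],
}

# Labels exist only for English (1 = positive, 0 = negative).
en_train = {"good": 1, "great": 1, "bad": 0, "awful": 0}

def centroid(label):
    """Mean vector of the English training words carrying this label."""
    vecs = [shared[("en", w)] for w, y in en_train.items() if y == label]
    return np.mean(vecs, axis=0)

centroids = {label: centroid(label) for label in (0, 1)}

def predict(lang, word):
    """Assign the label whose English centroid is closest to the word vector."""
    v = np.asarray(shared[(lang, word)])
    return min(centroids, key=lambda label: np.linalg.norm(v - centroids[label]))

# The classifier never saw Italian labels, yet it can score Italian words
# because both languages live in the same space.
print(predict("it", "buono"), predict("it", "pessimo"))
```

The point of the sketch is only that the classifier depends on vector positions, not on the language a word came from, so labelled data in one language can serve another.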
Abstract
Learning word embeddings from distributional information has been studied extensively, and many results are reported in the literature. By contrast, fewer studies address the case of multiple languages. The idea is to learn a single representation for a pair of languages such that semantically similar words lie close to one another in the induced space, irrespective of the language. In this way, when data are missing for a particular language, classifiers trained on another language can be used.
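One common way to obtain a single representation for a language pair is to map one monolingual space onto the other with an orthogonal transform fitted on a small seed dictionary (orthogonal Procrustes). The sketch below illustrates that general technique and is not necessarily the method used in this paper. The random vectors, the dimensionality, and the function name `procrustes_align` are assumptions made for illustration.

```python
import numpy as np

def procrustes_align(X_src, Y_tgt):
    """Return the orthogonal W minimizing ||X_src @ W - Y_tgt||_F.

    Row i of X_src and row i of Y_tgt are assumed to be a translation pair.
    """
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt

rng = np.random.default_rng(0)
dim = 4

# Hypothetical target-language vectors for six seed-dictionary words.
Y = rng.normal(size=(6, dim))

# Simulate the source-language space as a rotated copy of the target space.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
X = Y @ Q.T

W = procrustes_align(X, Y)
mapped = X @ W  # source vectors expressed in the target space

def cosine_nearest(queries, candidates):
    """Index of the most cosine-similar candidate row for each query row."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return np.argmax(q @ c.T, axis=1)

# After mapping, each source word's nearest neighbour is its translation.
nearest = cosine_nearest(mapped, Y)
print(nearest)
```

In this toy setup the source space is an exact rotation of the target space, so the alignment is perfect. Real monolingual spaces are only approximately related by a rotation, so mapped vectors land near, not on, their translations.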
Taxonomy
Topics: Natural Language Processing Techniques
