# Learning Multilingual Word Embeddings Using Image-Text Data

**Authors:** Karan Singhal, Karthik Raman, Balder ten Cate

arXiv: 1905.12260 · 2020-07-02

## TL;DR

This paper explores learning multilingual word embeddings from weakly-supervised image-text data, achieving competitive cross-lingual semantic similarity results without relying on costly labeled datasets.

## Contribution

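The paper proposes methods for learning multilingual word embeddings from weakly-supervised image-text data by enforcing similarity between the representation of an image and that of its associated text. This avoids the need for expensive labeled data, which is unavailable for low-resource languages.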
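To make the idea of enforcing image-text similarity concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a bag-of-words text embedding computed as the mean of word vectors, precomputed image feature vectors, and an in-batch softmax loss over cosine similarities. The loss choice, the toy vocabulary, and all names (`bow_embed`, `in_batch_softmax_loss`, `image_embs`) are illustrative assumptions not taken from this summary.

```python
import numpy as np

def bow_embed(tokens, vocab, word_vecs):
    """Bag-of-words embedding: mean of the vectors of in-vocabulary tokens."""
    ids = [vocab[t] for t in tokens if t in vocab]
    if not ids:
        return np.zeros(word_vecs.shape[1])
    return word_vecs[ids].mean(axis=0)

def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def in_batch_softmax_loss(text_embs, image_embs):
    """Assumed objective: text i should be most similar to image i
    among all images in the batch (softmax cross-entropy over cosine sims)."""
    sims = l2_normalize(text_embs) @ l2_normalize(image_embs).T  # shape (B, B)
    sims = sims - sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy data: words from two languages share one embedding table, so a
# shared image signal can pull "dog" and "perro" toward each other.
rng = np.random.default_rng(0)
vocab = {"dog": 0, "perro": 1, "cat": 2, "gato": 3}  # hypothetical vocabulary
word_vecs = rng.normal(size=(4, 8))    # trainable word embeddings
image_embs = rng.normal(size=(2, 8))   # stand-ins for image features

texts = [["dog"], ["gato"]]            # text paired with image 0 and image 1
text_embs = np.stack([bow_embed(t, vocab, word_vecs) for t in texts])
loss = in_batch_softmax_loss(text_embs, image_embs)
```

In a real training loop, `word_vecs` would be updated by gradient descent on this loss. Because text in different languages is paired with the same or similar images, minimizing it would tend to place semantically similar words from different languages close together.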

## Key findings

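- Multilingual embeddings can be learned from weakly-supervised image-text data by enforcing similarity between image and text representations
- A bag-of-words-based embedding model trained this way performs comparably to the state of the art on crosslingual semantic similarity tasks
- No expensive labeled data is required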

## Abstract

There has been significant interest recently in learning multilingual word embeddings -- in which semantically similar words across languages have similar embeddings. State-of-the-art approaches have relied on expensive labeled data, which is unavailable for low-resource languages, or have involved post-hoc unification of monolingual embeddings. In the present paper, we investigate the efficacy of multilingual embeddings learned from weakly-supervised image-text data. In particular, we propose methods for learning multilingual embeddings using image-text data, by enforcing similarity between the representations of the image and that of the text. Our experiments reveal that even without using any expensive labeled data, a bag-of-words-based embedding model trained on image-text data achieves performance comparable to the state-of-the-art on crosslingual semantic similarity tasks.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.12260/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1905.12260/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1905.12260/full.md

---
Source: https://tomesphere.com/paper/1905.12260