Exploration on Grounded Word Embedding: Matching Words and Images with Image-Enhanced Skip-Gram Model

Ruixuan Luo

arXiv:1809.02765·cs.CL·September 11, 2018

TL;DR

This paper introduces an Image-Enhanced Skip-Gram Model that learns grounded word embeddings by aligning them with image vectors, providing more interpretable and visually explainable word representations.

Contribution

The paper proposes a novel model that integrates image vectors with word embeddings, enhancing interpretability and grounding words in visual context.

Findings

1. High correlation between image vectors and word embeddings
2. Embeddings provide vivid image-based explanations
3. Model improves interpretability of word representations

Abstract

Word embedding is designed to represent the semantic meaning of a word with a low-dimensional vector. The state-of-the-art methods of learning word embeddings (word2vec and GloVe) use only word co-occurrence information. The learned embeddings are real-valued vectors, which are obscure to humans. In this paper, we propose an Image-Enhanced Skip-Gram Model to learn grounded word embeddings by representing the word vectors in the same hyperplane as image vectors. Experiments show that the image vectors and word embeddings learned by our model are highly correlated, which indicates that our model is able to provide a vivid image-based explanation of the word embeddings.
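To make the idea of matching words and images concrete, the following is a minimal sketch of how a word could be "explained" by its nearest image once word vectors and image vectors live in a shared space. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the training objective or the similarity measure, so cosine similarity, the function names `cosine` and `nearest_images`, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_images(word_vec, image_vecs, k=1):
    """Return the ids of the k image vectors most similar to word_vec.

    Assumes word and image vectors share one space and one dimensionality.
    """
    ranked = sorted(
        image_vecs,
        key=lambda image_id: cosine(word_vec, image_vecs[image_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy vectors, invented for illustration (not from the paper).
word_vecs = {
    "cat": [0.9, 0.1, 0.0],
    "car": [0.0, 0.2, 0.9],
}
image_vecs = {
    "img_cat": [1.0, 0.0, 0.1],
    "img_car": [0.1, 0.1, 1.0],
    "img_tree": [0.0, 1.0, 0.0],
}

for word, vec in word_vecs.items():
    print(word, nearest_images(vec, image_vecs, k=1))
```

In this toy setup each word retrieves the image whose vector points in nearly the same direction, which is one plausible reading of an image-based explanation of an embedding. How the actual model learns the shared space is not specified in the abstract and is not reproduced here.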

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications