Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings

Nikola Ljubešić; Darja Fišer; Anita Peti-Stantić

arXiv:1807.02903 · cs.CL · September 15, 2022 · 1 cites

TL;DR

This paper demonstrates that concreteness and imageability of words can be effectively predicted within and across languages using supervised learning on word embeddings, with cross-lingual transfer outperforming dictionary-based methods.

Contribution

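The paper introduces a method for predicting concreteness and imageability across languages using cross-lingual word embeddings aligned to a single vector space. It shows that both properties are highly predictable and that transfer via embeddings is more efficient than transfer via bilingual dictionaries.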

Findings

01

High predictability of concreteness and imageability within languages.

02

Moderate loss in correlation (up to 20%) when predicting across languages.

03

Cross-lingual transfer via embeddings outperforms dictionary transfer.

Abstract

The notions of concreteness and imageability, traditionally important in psycholinguistics, are gaining significance in semantic-oriented natural language processing tasks. In this paper we investigate the predictability of these two concepts via supervised learning, using word embeddings as explanatory variables. We perform predictions both within and across languages by exploiting collections of cross-lingual embeddings aligned to a single vector space. We show that the notions of concreteness and imageability are highly predictable both within and across languages, with a moderate loss of up to 20% in correlation when predicting across languages. We further show that the cross-lingual transfer via word embeddings is more efficient than the simple transfer via bilingual dictionaries.
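The setup described in the abstract (supervised regression on word embeddings, applied within one language and then transferred to another through a shared vector space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data is synthetic, the regressor is a closed-form ridge regression chosen for brevity, and the names fit_ridge, spearman, make_language, and true_direction are hypothetical. It is not the authors' implementation and does not reproduce their results.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha*I)^-1 X^T y.

    No intercept is fitted, so ratings are assumed to be mean-centered.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def spearman(a, b):
    """Spearman correlation as the Pearson correlation of ranks (assumes no ties)."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


rng = np.random.default_rng(0)
dim = 50
# Hypothetical direction in the shared space that encodes the rated property.
true_direction = rng.normal(size=dim)


def make_language(n_words, alignment_noise):
    """Synthetic stand-in for one language: embeddings plus human ratings.

    alignment_noise models imperfect alignment to the shared vector space.
    """
    X = rng.normal(size=(n_words, dim))
    y = X @ true_direction + rng.normal(scale=1.0, size=n_words)
    X_observed = X + rng.normal(scale=alignment_noise, size=X.shape)
    return X_observed, y


# Source language: rated words, split into train and held-out test sets.
X_src, y_src = make_language(2000, alignment_noise=0.0)
X_train, y_train = X_src[:1500], y_src[:1500]
X_test, y_test = X_src[1500:], y_src[1500:]

# Target language: embeddings live in the same space, but alignment is noisy.
X_tgt, y_tgt = make_language(500, alignment_noise=0.5)

w = fit_ridge(X_train, y_train)

# Within-language: train and evaluate on the source language.
rho_within = spearman(X_test @ w, y_test)
# Cross-lingual: apply the source-language model directly to target embeddings.
rho_cross = spearman(X_tgt @ w, y_tgt)

relative_loss = (rho_within - rho_cross) / rho_within
print(f"within: {rho_within:.3f}  cross: {rho_cross:.3f}  loss: {relative_loss:.1%}")
```

In this toy setting the cross-lingual score drops because the target embeddings are perturbed, which loosely mirrors the correlation loss the abstract reports for cross-lingual prediction. The size of the drop here depends entirely on the assumed noise level and says nothing about the paper's actual figures.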

Code & Models

Repositories

clarinsi/megahr-crossling
none · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems