Separated by an Un-common Language: Towards Judgment Language Informed Vector Space Modeling
Ira Leviant, Roi Reichart

TL;DR
This study investigates how the language of judgment influences human semantic similarity scores and demonstrates that multilingual vector space models can better align with human judgments across different languages.
Contribution
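The paper extends the evaluation of semantic vector space models beyond English: it translates two evaluation sets into Italian, German and Russian, collects human scores in each language, and shows that multilingual VSMs can correlate better with human judgments across languages.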
Findings
Human judgments are significantly affected by judgment language.
Monolingual VSMs do not always align best with judgments in their training language.
Multilingual VSMs can enhance correlation with human semantic judgments.
Abstract
A common evaluation practice in the vector space models (VSMs) literature is to measure the models' ability to predict human judgments about lexical semantic relations between word pairs. Most existing evaluation sets, however, consist of scores collected for English word pairs only, ignoring the potential impact of the judgment language in which word pairs are presented on the human scores. In this paper we translate two prominent evaluation sets, wordsim353 (association) and SimLex999 (similarity), from English to Italian, German and Russian and collect scores for each dataset from crowdworkers fluent in its language. Our analysis reveals that human judgments are strongly impacted by the judgment language. Moreover, we show that the predictions of monolingual VSMs do not necessarily best correlate with human judgments made with the language used for model training, suggesting that…
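The evaluation practice the abstract describes, scoring a VSM by how well it predicts human judgments for word pairs, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' protocol: it assumes the model score is the cosine similarity of two word vectors and that agreement is measured with Spearman's rank correlation, a common choice in this literature that the excerpt above does not itself name. The function names (`cosine`, `average_ranks`, `spearman`, `evaluate`) and all vectors and scores are hypothetical toy values, not data from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def evaluate(vectors, scored_pairs):
    """Correlate model similarities with human scores.

    Pairs with a word missing from `vectors` are skipped (an assumption).
    """
    model, human = [], []
    for w1, w2, score in scored_pairs:
        if w1 in vectors and w2 in vectors:
            model.append(cosine(vectors[w1], vectors[w2]))
            human.append(score)
    return spearman(model, human)

# Hypothetical toy embeddings and human scores, invented for illustration.
toy_vectors = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "road": [0.6, 0.8],
    "banana": [0.0, 1.0],
}
# The same word pairs judged in two (hypothetical) judgment languages.
scores_language_a = [("car", "automobile", 9.0), ("car", "road", 6.0), ("car", "banana", 1.0)]
scores_language_b = [("car", "automobile", 8.5), ("car", "road", 3.0), ("car", "banana", 4.0)]

rho_a = evaluate(toy_vectors, scores_language_a)  # approximately 1.0
rho_b = evaluate(toy_vectors, scores_language_b)  # approximately 0.5
```

In this toy setup the same model agrees perfectly with one set of judgments and only partly with the other, which illustrates why the judgment language of an evaluation set can change the measured quality of a model.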
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
