Investigating Idiomaticity in Word Representations
Wei He, Tiago Kramer Vieira, Marcos Garcia, Carolina Scarton, Marco Idiart, Aline Villavicencio

TL;DR
This paper evaluates how well current word representation models capture idiomatic meanings in multiword expressions, revealing that they still struggle to accurately represent idiomaticity despite high superficial similarity scores.
Contribution
It introduces a new dataset of noun compounds with human idiomaticity judgments and proposes metrics to assess models' sensitivity to idiomaticity changes.
Findings
Models do not accurately capture idiomaticity despite high similarity scores.
Contextualized models still rely on superficial lexical clues.
Current models struggle to incorporate relevant semantic cues for idiomatic expressions.
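To make the point about superficial lexical clues concrete, here is a minimal sketch of how a representation driven by surface overlap can score a meaning-changing substitution as more similar than a meaning-preserving paraphrase. Everything in it is an assumption for illustration: `bow_embed` is a toy bag-of-words stand-in rather than any model evaluated in the paper, the three sentences are invented, and cosine similarity is used only as a generic comparison, not as the paper's proposed metric.

```python
import math
from collections import Counter


def bow_embed(sentence):
    """Toy stand-in for a representation model (hypothetical): word counts."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Invented example sentences (not taken from the paper's dataset).
original = "my colleague is an eager beaver"
meaning_preserving = "my colleague is an enthusiastic person"  # paraphrase of the idiom
meaning_changing = "my colleague is an eager rodent"           # literal swap of one component

sim_preserved = cosine(bow_embed(original), bow_embed(meaning_preserving))
sim_changed = cosine(bow_embed(original), bow_embed(meaning_changing))

print(f"original vs. paraphrase:   {sim_preserved:.3f}")
print(f"original vs. literal swap: {sim_changed:.3f}")
```

In this toy setup the literal swap shares more words with the original than the paraphrase does, so it receives the higher score even though it loses the idiomatic meaning. A model that captured idiomaticity would be expected to show the opposite ordering.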
Abstract
Idiomatic expressions are an integral part of human languages, often used to express complex ideas in compressed or conventional ways (e.g. eager beaver as a keen and enthusiastic person). However, their interpretations may not be straightforwardly linked to the meanings of their individual components in isolation and this may have an impact for compositional approaches. In this paper, we investigate to what extent word representation models are able to go beyond compositional word combinations and capture multiword expression idiomaticity and some of the expected properties related to idiomatic meanings. We focus on noun compounds of varying levels of idiomaticity in two languages (English and Portuguese), presenting a dataset of minimal pairs containing human idiomaticity judgments for each noun compound at both type and token levels, their paraphrases and their occurrences in…
Taxonomy
Topics: Language, Metaphor, and Cognition · Natural Language Processing Techniques · Translation Studies and Practices
Methods: Sparse Evolutionary Training · Focus
