UAlberta at SemEval 2022 Task 2: Leveraging Glosses and Translations for Multilingual Idiomaticity Detection
Bradley Hauer, Seeratpal Jaura, Talgat Omarov, Grzegorz Kondrak

TL;DR
This paper presents systems from the University of Alberta for multilingual idiomaticity detection, utilizing lexical knowledge and translation-based methods to distinguish idiomatic expressions from literal ones in multiple languages.
Contribution
It introduces two linguistically grounded methods: one leveraging word meanings and another using translation differences to detect idiomatic expressions.
Findings
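The word-meaning-based method, which adds the meanings of an expression's individual words to a binary classifier, receives the stronger support from the results.
The translation-based approach, which checks whether an in-context translation of the expression is literal, is also supported by the results.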
Both methods support the linguistic assumptions about idiomaticity.
Abstract
We describe the University of Alberta systems for the SemEval-2022 Task 2 on multilingual idiomaticity detection. Working under the assumption that idiomatic expressions are noncompositional, our first method integrates information on the meanings of the individual words of an expression into a binary classifier. Further hypothesizing that literal and idiomatic expressions translate differently, our second method translates an expression in context, and uses a lexical knowledge base to determine if the translation is literal. Our approaches are grounded in linguistic phenomena, and leverage existing sources of lexical knowledge. Our results offer support for both approaches, particularly the former.
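The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `GLOSSES` and `LITERAL_TRANSLATIONS` dictionaries are tiny hypothetical stand-ins for a real lexical knowledge base, the translated sentences are supplied by hand rather than by a translation system, and the `[SEP]` separator and function names are assumptions made for illustration.

```python
# Hypothetical toy resources standing in for a lexical knowledge base.
GLOSSES = {
    "wet": "covered or soaked with liquid",
    "blanket": "a large piece of cloth used as a bed covering",
}

# Hypothetical bilingual lexicon: English word -> literal French translations.
LITERAL_TRANSLATIONS = {
    "wet": {"mouillé", "mouillée", "humide"},
    "blanket": {"couverture"},
}


def add_glosses(sentence, expression):
    """First idea: append the gloss of each component word to the
    text that would be passed to a binary classifier."""
    parts = [sentence]
    for word in expression.lower().split():
        gloss = GLOSSES.get(word)
        if gloss:
            parts.append(f"{word}: {gloss}")
    return " [SEP] ".join(parts)


def is_literal_translation(expression, translated_sentence):
    """Second idea: treat the translation as literal only if every
    component word has a known literal translation in the output."""
    cleaned = translated_sentence.lower().replace(".", " ").replace(",", " ")
    tokens = set(cleaned.split())
    for word in expression.lower().split():
        candidates = LITERAL_TRANSLATIONS.get(word, set())
        if not (candidates & tokens):
            return False
    return True


# Hand-written translations, assumed for illustration only.
literal_fr = "Elle a posé la couverture mouillée sur le lit."
idiomatic_fr = "Ne sois pas un rabat-joie."

print(add_glosses("Don't be a wet blanket.", "wet blanket"))
print(is_literal_translation("wet blanket", literal_fr))
print(is_literal_translation("wet blanket", idiomatic_fr))
```

In this toy setup, a translation that contains literal equivalents of both component words is judged literal, while one that renders the expression with an unrelated phrase is judged idiomatic. A real system would need lemmatization and a far larger lexicon for the lookup to be reliable.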
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Translation Studies and Practices
Methods: Balanced Selection
