Human Associations Help to Detect Conventionalized Multiword Expressions
Natalia Loukachevitch, Anastasia Gerasimova

TL;DR
This study demonstrates that analyzing human associations with phrases and their components can effectively identify conventionalized multiword expressions, using experiments on Russian language data.
Contribution
It introduces a novel method for detecting conventionalized phrases based on human association patterns and applies it to Russian language data.
Findings
Frequent mutual associations of component words indicate phrase conventionalization.
Low entropy of phrase associations signals conventionalized expressions.
Low intersection between component-word associations and phrase associations helps identify conventionalized phrases.
Abstract
In this paper we show that human evidence about the conventionalization of a phrase can be obtained by asking native speakers for their associations to the phrase and to its component words. If the component words of a phrase have each other as frequent associations, the phrase can be considered conventionalized. Another type of conventionalized phrase can be revealed using two factors: low entropy of phrase associations and low intersection of component-word and phrase associations. The association experiments were performed for the Russian language.
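The three signals named in the abstract can be illustrated with a minimal sketch. The exact measures, normalizations, and thresholds used in the paper are not given in this summary, so everything below is an assumption: Shannon entropy in bits stands in for "entropy of phrase associations", a count-weighted share of shared responses stands in for "intersection", and the function names and the English toy counts are hypothetical, not data from the Russian experiments.

```python
import math
from collections import Counter


def entropy(assoc):
    """Shannon entropy (in bits) of a response-count distribution."""
    total = sum(assoc.values())
    return -sum((c / total) * math.log2(c / total)
                for c in assoc.values() if c > 0)


def mutual_association(word_a, word_b, assoc_a, assoc_b):
    """Share of responses to each component word that name the other one."""
    share_ab = assoc_a.get(word_b, 0) / sum(assoc_a.values())
    share_ba = assoc_b.get(word_a, 0) / sum(assoc_b.values())
    return share_ab, share_ba


def overlap(phrase_assoc, component_assocs):
    """Count-weighted share of phrase responses that also appear
    among the responses to any component word."""
    component_vocab = set()
    for assoc in component_assocs:
        component_vocab.update(assoc)
    total = sum(phrase_assoc.values())
    shared = sum(c for resp, c in phrase_assoc.items()
                 if resp in component_vocab)
    return shared / total


# Invented toy counts for the English phrase "red tape" (illustration only).
assoc_red = Counter({"color": 5, "blood": 3, "tape": 2})
assoc_tape = Counter({"sticky": 6, "red": 2, "measure": 2})
assoc_phrase = Counter({"bureaucracy": 8, "paperwork": 2})

mutual = mutual_association("red", "tape", assoc_red, assoc_tape)
phrase_entropy = entropy(assoc_phrase)
phrase_overlap = overlap(assoc_phrase, [assoc_red, assoc_tape])

print(mutual, round(phrase_entropy, 3), phrase_overlap)
```

In this toy case the phrase responses are concentrated on one answer (low entropy) and share nothing with the component-word responses (zero overlap), which is the pattern the abstract associates with the second type of conventionalized phrase. Any cutoff for "frequent" or "low" would have to be chosen empirically.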
Taxonomy
Topics: Natural Language Processing Techniques · Authorship Attribution and Profiling · Topic Modeling
