Identifying Phrasemes via Interlingual Association Measures -- A Data-driven Approach on Dependency-parsed and Word-aligned Parallel Corpora

Johannes Graën

arXiv:1709.08196 · cs.CL · September 26, 2017

TL;DR

This paper introduces a data-driven method for identifying phrasemes across languages using interlingual association measures applied to dependency-parsed and word-aligned parallel corpora, aiming to improve multilingual lexical analysis.

Contribution

It presents a novel approach leveraging interlingual association measures on parallel corpora to automatically identify phrasemes across languages.

Findings

01

Effective identification of cross-lingual phrasemes demonstrated

02

Improved accuracy over previous methods in multilingual contexts

03

Applicable to various language pairs and corpora sizes

Abstract

This is a preprint of the article "Identifying Phrasemes via Interlingual Association Measures" that was presented in February 2016 at the LeKo (Lexical combinations and typified speech in a multilingual context) conference in Innsbruck.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification