Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers

Luis Espinosa-Anke; Alexander Shvets; Alireza Mohammadshahi; James Henderson; Leo Wanner

arXiv:2205.11456·cs.CL·May 24, 2022·1 cites

TL;DR

This paper introduces a BERT-based sequence tagging model with a graph-aware transformer architecture for recognizing and categorizing lexical collocations across multiple languages, emphasizing the importance of syntactic dependency encoding.

Contribution

The paper presents a BERT-based sequence tagging model, enhanced with a graph-aware transformer architecture, for recognizing and categorizing lexical collocations in context, and it reports insights into how collocation typification differs across English, Spanish and French.

Findings

01 Explicit syntactic dependency encoding improves model performance.

02 The model effectively recognizes collocations in English, Spanish, and French.

03 Insights into language-specific collocation patterns were obtained.

Abstract

Recognizing and categorizing lexical collocations in context is useful for language learning, dictionary compilation and downstream NLP. However, it is a challenging task due to the varying degrees of frozenness lexical collocations exhibit. In this paper, we put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context. Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
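
To make the sequence tagging formulation concrete, the following minimal sketch shows how per-token tags could be decoded into categorized collocation elements. It does not reproduce the paper's model or label set: the BIO tagging scheme, the `BASE` and `COLLOCATE` labels, the example sentence and the `decode_bio` helper are all assumptions chosen for illustration, and the tags are written by hand rather than predicted by a model.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group per-token BIO tags into (label, text) spans.

    Hypothetical helper: the BIO scheme and the label names are
    illustrative assumptions, not the tag set used in the paper.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; close any span that is still open.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the open span with the same label.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag: close the open span, skip the token.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []

    if current_tokens:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hand-written tags standing in for a tagger's predictions.
tokens = ["The", "storm", "brought", "heavy", "rain"]
tags = ["O", "O", "O", "B-COLLOCATE", "B-BASE"]
print(decode_bio(tokens, tags))  # [('COLLOCATE', 'heavy'), ('BASE', 'rain')]
```

In a setup like this, a tagger would emit one tag per token, and a richer label set could carry the collocation category directly in the tag name.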

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling