Persian Ezafe Recognition Using Transformers and Its Role in Part-Of-Speech Tagging

Ehsan Doostmohammadi; Minoo Nassajian; Adel Rahimi

arXiv:2009.09474·cs.CL·October 6, 2020

TL;DR

This paper applies transformer models such as BERT and XLMRoBERTa to ezafe recognition in Persian, then tests whether adding ezafe information improves part-of-speech tagging.

Contribution

It shows that transformer models set a new state of the art in ezafe recognition, and it examines the effect of ezafe information on Persian POS tagging, finding that this information does not help transformer-based taggers.

Findings

01

XLMRoBERTa exceeds the previous state of the art in ezafe recognition by 2.68% F1-score.

02

Ezafe information enhances POS tagging accuracy in traditional models.

03

Ezafe information does not benefit transformer-based POS tagging methods.
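
One way ezafe information could be supplied to a feature-based (non-transformer) tagger is as extra per-token features. The sketch below is only an illustration under that assumption: the feature names, the helper `token_features`, and the romanized example phrase are hypothetical, and the feature set used in the paper is not described on this page.

```python
from typing import Dict, List, Union

Feature = Union[str, bool]


def token_features(tokens: List[str], ezafe: List[bool], i: int) -> Dict[str, Feature]:
    """Build a feature dict for token i, including hypothetical ezafe flags."""
    feats: Dict[str, Feature] = {
        # Ordinary lexical context features.
        "word": tokens[i],
        "suffix2": tokens[i][-2:],
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        # Assumed ezafe features: does this token, or the previous one, carry ezafe?
        "has_ezafe": ezafe[i],
        "prev_has_ezafe": ezafe[i - 1] if i > 0 else False,
    }
    return feats


# Romanized example (an assumption for illustration): "ketâb-e xub-e man", "my good book".
tokens = ["ketâb", "xub", "man"]
ezafe = [True, True, False]
features = [token_features(tokens, ezafe, i) for i in range(len(tokens))]
```

Dicts of this shape can be passed to a feature-based sequence tagger. A transformer-based tagger receives no such hand-built features, which is one setting where the extra ezafe flag has no obvious entry point.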

Abstract

Ezafe is a grammatical particle in some Iranian languages that links two words together. Despite the important information it conveys, it is almost never indicated in Persian script, which leads to mistakes in reading complex sentences and to errors in natural language processing tasks. In this paper, we experiment with different machine learning methods to achieve state-of-the-art results in the task of ezafe recognition. The transformer-based methods, BERT and XLMRoBERTa, achieve the best results, with the latter scoring 2.68% higher in F1-score than the previous state of the art. Moreover, we use ezafe information to improve Persian part-of-speech tagging results, show that such information is not useful to transformer-based methods, and explain why that might be the case.
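
Ezafe recognition can be framed as per-token binary labeling: each token either carries ezafe or does not, and predictions are scored with F1 on the positive class. The following is a minimal sketch under that assumption, not the authors' evaluation code; the helper `f1_score_binary`, the 0/1 label encoding, and the romanized example phrase are all illustrative choices.

```python
from typing import List


def f1_score_binary(gold: List[int], pred: List[int]) -> float:
    """F1 for the positive class (1 = token carries ezafe, 0 = it does not)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Romanized example (an assumption for illustration): "ketâb-e xub-e man", "my good book".
# The first two tokens carry ezafe, the last does not.
tokens = ["ketâb", "xub", "man"]
gold = [1, 1, 0]
pred = [1, 0, 0]  # a hypothetical model output that misses the ezafe on "xub"
score = f1_score_binary(gold, pred)
```

Here the prediction has precision 1.0 and recall 0.5, so the F1 is 2/3.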

Code & Models

Repositories

edoost/pert (PyTorch, official)

Taxonomy

Methods

Linear Layer · Adam · Softmax · Dense Connections · Weight Decay · Dropout · Linear Warmup With Linear Decay · Attention Dropout · Layer Normalization