TL;DR
This paper applies transformer models such as BERT and XLMRoBERTa to ezafe recognition in Persian, then tests whether adding ezafe information improves part-of-speech (POS) tagging.
Contribution
It shows that transformer models achieve the best results on ezafe recognition, and it investigates the effect of ezafe information on Persian POS tagging, finding that this information does not help transformer-based taggers.
Findings
XLMRoBERTa exceeds the previous state of the art in ezafe recognition by 2.68% F1-score.
Ezafe information enhances POS tagging accuracy in traditional models.
Ezafe information does not benefit transformer-based POS tagging methods.
Abstract
Ezafe is a grammatical particle in some Iranian languages that links two words together. Despite the important information it conveys, it is almost never indicated in Persian script, resulting in mistakes in reading complex sentences and errors in natural language processing tasks. In this paper, we experiment with different machine learning methods to achieve state-of-the-art results in the task of ezafe recognition. Transformer-based methods, BERT and XLMRoBERTa, achieve the best results, the latter achieving 2.68% F1-score more than the previous state-of-the-art. Moreover, we use ezafe information to improve Persian part-of-speech tagging results, show that such information is not useful to transformer-based methods, and explain why that might be the case.
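The abstract reports results as an F1-score. As a rough illustration only, ezafe recognition can be framed as binary labeling per token (1 if the token carries an ezafe, 0 otherwise), with F1 computed over the positive label. The sketch below assumes that framing. The function name, the label encoding, and the toy label sequences are hypothetical and are not taken from the paper.

```python
def ezafe_f1(gold, pred, positive=1):
    """F1 over the positive (ezafe) label for two equal-length label lists.

    Assumption: each token has one binary label, 1 = carries ezafe.
    """
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for a five-token sentence (not data from the paper).
gold_labels = [1, 0, 0, 1, 0]
pred_labels = [1, 0, 1, 1, 0]
score = ezafe_f1(gold_labels, pred_labels)
print(f"F1 = {score:.2f}")
```

In this toy case the prediction has one false positive and no false negatives, so precision is 2/3 and recall is 1. The paper's own evaluation setup may differ in detail.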
Taxonomy
Methods: Linear Layer · Adam · Softmax · Dense Connections · Weight Decay · Dropout · Linear Warmup With Linear Decay · Attention Dropout · Layer Normalization
