A hybrid entity-centric approach to Persian pronoun resolution
Hassan Haji Mohammadi, Alireza Talebpour, Ahmad Mahmoudi Aznaveh, Samaneh Yazdani

TL;DR
This paper introduces a hybrid rule-based and machine learning approach for Persian pronoun resolution, utilizing a new corpus and demonstrating improved performance over previous models.
Contribution
It presents a novel hybrid model combining rule-based sieves with a machine learning sieve specifically for Persian pronoun resolution, along with a new Persian coreference corpus.
Findings
Improved accuracy on Persian coreference datasets
Development of a new Persian coreference corpus
Effective combination of rule-based and machine learning methods
Abstract
Pronoun resolution is a challenging subtask of coreference resolution, an essential field in natural language processing. Coreference resolution is about finding all mentions in a text that refer to the same real-world entity. This paper presents a hybrid model that combines multiple rule-based sieves with a machine-learning sieve for pronouns. For this purpose, seven high-precision rule-based sieves are designed for the Persian language. Then, a random forest classifier links pronouns to the previously built partial clusters. The presented method performs well by using a pipeline design and combining the advantages of machine-learning and rule-based methods. It also addresses some challenges of end-to-end models. In this paper, the authors develop a Persian coreference corpus called Mehr, consisting of 400 documents. This corpus fixes some weaknesses of the previous…
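The pipeline described in the abstract (rule-based sieves that build partial clusters, followed by a pronoun sieve that attaches each pronoun to one of them) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration, not the authors' implementation: the `Mention` type, the single `exact_match_sieve` (the paper's seven Persian sieves are not reproduced), and `recency_score`, which merely stands in for the probability a trained random forest classifier would return. English strings are used in place of Persian text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Mention:
    idx: int                  # position of the mention in document order
    text: str                 # surface form
    is_pronoun: bool = False


Cluster = List[Mention]


def exact_match_sieve(clusters: List[Cluster]) -> List[Cluster]:
    """Hypothetical rule-based sieve: merge non-pronoun clusters that share a surface form."""
    merged: List[Cluster] = []
    for cluster in clusters:
        target = None
        if not any(m.is_pronoun for m in cluster):
            texts = {m.text for m in cluster}
            target = next(
                (c for c in merged
                 if any(not m.is_pronoun and m.text in texts for m in c)),
                None,
            )
        if target is None:
            merged.append(list(cluster))
        else:
            target.extend(cluster)
    return merged


def recency_score(pronoun: Mention, cluster: Cluster) -> float:
    """Stand-in for a classifier probability: closer preceding mentions score higher."""
    preceding = [m.idx for m in cluster if m.idx < pronoun.idx]
    if not preceding:
        return 0.0
    return 1.0 / (pronoun.idx - max(preceding))


def pronoun_sieve(clusters: List[Cluster],
                  score: Callable[[Mention, Cluster], float],
                  threshold: float = 0.5) -> List[Cluster]:
    """Attach each pronoun to its best-scoring partial cluster, if the score clears the threshold."""
    entities = [list(c) for c in clusters if not all(m.is_pronoun for m in c)]
    pronouns = sorted(
        (m for c in clusters if all(m.is_pronoun for m in c) for m in c),
        key=lambda m: m.idx,
    )
    unresolved: List[Cluster] = []
    for p in pronouns:
        best = max(entities, key=lambda c: score(p, c), default=None)
        if best is not None and score(p, best) >= threshold:
            best.append(p)
        else:
            unresolved.append([p])
    return entities + unresolved


def resolve(mentions: List[Mention],
            sieves: List[Callable[[List[Cluster]], List[Cluster]]],
            score: Callable[[Mention, Cluster], float]) -> List[Cluster]:
    """Run the rule-based sieves in order, then the pronoun sieve."""
    clusters = [[m] for m in mentions]   # every mention starts as a singleton
    for sieve in sieves:
        clusters = sieve(clusters)
    return pronoun_sieve(clusters, score)


mentions = [
    Mention(0, "Sara"),
    Mention(1, "Ali"),
    Mention(2, "Sara"),
    Mention(3, "she", is_pronoun=True),
]
for cluster in resolve(mentions, [exact_match_sieve], recency_score):
    print([m.text for m in cluster])
```

Scoring a pronoun against a whole partial cluster, rather than against a single antecedent mention, is what makes the sketch entity-centric; in a real system the `score` callable would wrap a trained classifier over cluster-level features.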
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies
Methods: Test
