A hybrid entity-centric approach to Persian pronoun resolution

Hassan Haji Mohammadi; Alireza Talebpour; Ahmad Mahmoudi Aznaveh; Samaneh Yazdani

arXiv:2211.06257·cs.CL·November 14, 2022

TL;DR

This paper introduces a hybrid rule-based and machine learning approach for Persian pronoun resolution, utilizing a new corpus and demonstrating improved performance over previous models.

Contribution

It presents a novel hybrid model combining rule-based sieves with a machine learning sieve specifically for Persian pronoun resolution, along with a new Persian coreference corpus.

Findings

01  Improved accuracy on Persian coreference datasets

02  Development of a new Persian coreference corpus

03  Effective combination of rule-based and machine learning methods

Abstract

Pronoun resolution is a challenging subtask of coreference resolution, an essential field in natural language processing. Coreference resolution is about finding all mentions in a text that refer to the same real-world entity. This paper presents a hybrid model that combines multiple rule-based sieves with a machine-learning sieve for pronouns. For this purpose, seven high-precision rule-based sieves are designed for the Persian language. Then, a random forest classifier links pronouns to the previously built partial clusters. The presented method performs well by using a pipeline design that combines the advantages of machine learning and rule-based methods. It also addresses some challenges of end-to-end models. In this paper, the authors develop a Persian coreference corpus called Mehr, consisting of 400 documents. This corpus fixes some weaknesses of the previous…
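The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single `exact_match_sieve` stands in for the seven rule-based sieves, `score_link` is a hand-written stand-in for the random forest classifier's link probability, and the `Mention` fields, the 0.5 threshold, and the English example mentions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Mention:
    idx: int                  # position in document order
    text: str
    is_pronoun: bool = False
    number: str = "sg"        # hypothetical agreement feature ("sg" or "pl")


@dataclass
class Cluster:
    mentions: List[Mention] = field(default_factory=list)


def exact_match_sieve(clusters: List[Cluster]) -> List[Cluster]:
    """Hypothetical rule-based sieve: merge non-pronoun clusters whose first mentions match exactly."""
    merged: List[Cluster] = []
    for c in clusters:
        head = c.mentions[0]
        target = None
        if not head.is_pronoun:
            for m in merged:
                first = m.mentions[0]
                if not first.is_pronoun and first.text == head.text:
                    target = m
                    break
        if target is not None:
            target.mentions.extend(c.mentions)
        else:
            merged.append(Cluster(list(c.mentions)))
    return merged


def score_link(pronoun: Mention, cluster: Cluster) -> float:
    """Stand-in for a classifier probability: number agreement plus recency of the nearest antecedent."""
    antecedents = [m for m in cluster.mentions
                   if m.idx < pronoun.idx and not m.is_pronoun]
    if not antecedents:
        return 0.0
    nearest = max(antecedents, key=lambda m: m.idx)
    agree = 1.0 if nearest.number == pronoun.number else 0.0
    recency = 1.0 / (pronoun.idx - nearest.idx)
    return 0.7 * agree + 0.3 * recency


def pronoun_sieve(clusters: List[Cluster],
                  scorer: Callable[[Mention, Cluster], float],
                  threshold: float = 0.5) -> List[Cluster]:
    """Entity-centric step: attach each pronoun to the best-scoring partial cluster."""
    entities = [Cluster(list(c.mentions)) for c in clusters
                if not c.mentions[0].is_pronoun]
    pronouns = sorted((m for c in clusters for m in c.mentions if m.is_pronoun),
                      key=lambda m: m.idx)
    for p in pronouns:
        best = max(entities, key=lambda c: scorer(p, c), default=None)
        if best is not None and scorer(p, best) >= threshold:
            best.mentions.append(p)
        else:
            entities.append(Cluster([p]))  # leave the pronoun unresolved
    return entities


def resolve(mentions: List[Mention]) -> List[Cluster]:
    clusters = [Cluster([m]) for m in mentions]   # start from singleton clusters
    for sieve in (exact_match_sieve,):            # rule-based sieves run first, in order
        clusters = sieve(clusters)
    return pronoun_sieve(clusters, score_link)    # pronouns are linked last


doc = [
    Mention(0, "Maryam"),
    Mention(1, "the students", number="pl"),
    Mention(2, "Maryam"),
    Mention(3, "she", is_pronoun=True),
    Mention(4, "they", is_pronoun=True, number="pl"),
]
for cluster in resolve(doc):
    print([m.text for m in cluster.mentions])
```

Running the rule-based sieves before the pronoun step means each pronoun is scored against whole partial clusters rather than isolated mentions, which is the entity-centric idea named in the title. In a fuller version, `score_link` would be replaced by a trained classifier over features of the pronoun and the candidate cluster.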

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies
