Ensemble Transfer Learning for Multilingual Coreference Resolution
Tuan Manh Lai, Heng Ji

TL;DR
This paper introduces an ensemble transfer learning framework for multilingual coreference resolution. It uses Wikipedia anchor texts as a source of distantly supervised training data and achieves state-of-the-art results on several languages.
Contribution
It proposes a novel ensemble approach combined with a low-cost distantly-supervised training method using Wikipedia anchor texts for multilingual coreference resolution.
Findings
The ensembles outperform the baseline by up to 7.68% in F1 score.
Achieves new state-of-the-art results for Arabic, Dutch, and Spanish.
Wikipedia anchor texts are an effective, low-cost source of distantly supervised training data.
Abstract
Entity coreference resolution is an important research problem with many applications, including information extraction and question answering. Coreference resolution for English has been studied extensively. However, there is relatively little work for other languages. A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data. To overcome this challenge, we design a simple but effective ensemble-based framework that combines various transfer learning (TL) techniques. We first train several models using different TL methods. Then, during inference, we compute the unweighted average scores of the models' predictions to extract the final set of predicted clusters. Furthermore, we also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts. Leveraging the idea that the…
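The inference step described in the abstract (averaging the models' scores without weights, then extracting clusters) can be illustrated with a minimal sketch. The abstract does not say what form the scores take, so everything concrete here is an assumption: each model is assumed to emit a pairwise antecedent score for every earlier mention, a mention is linked to its best-scoring antecedent only when the averaged score exceeds a threshold of 0, and the names `average_scores` and `extract_clusters` are hypothetical rather than taken from the authors' code.

```python
from typing import Dict, List

# ASSUMPTION: scores[i][j] is one model's score that mention j (j < i)
# is the antecedent of mention i. Row 0 is empty (no earlier mentions).
Scores = List[List[float]]


def average_scores(model_scores: List[Scores]) -> Scores:
    """Unweighted mean of the per-model antecedent scores."""
    n_models = len(model_scores)
    first = model_scores[0]
    return [
        [sum(m[i][j] for m in model_scores) / n_models for j in range(len(first[i]))]
        for i in range(len(first))
    ]


def extract_clusters(scores: Scores, threshold: float = 0.0) -> List[List[int]]:
    """Link each mention to its best antecedent if the score beats the
    threshold, then group linked mentions with a small union-find."""
    parent = list(range(len(scores)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, row in enumerate(scores):
        if not row:
            continue
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] > threshold:
            parent[find(i)] = find(best)

    groups: Dict[int, List[int]] = {}
    for mention in range(len(scores)):
        groups.setdefault(find(mention), []).append(mention)
    # Singletons are dropped; only clusters with two or more mentions remain.
    return sorted(c for c in groups.values() if len(c) > 1)


# Toy scores for four mentions from two hypothetical models.
model_a = [[], [2.0], [-1.0, -3.0], [-2.0, -1.0, 1.0]]
model_b = [[], [1.0], [-2.0, 1.0], [-1.0, -2.0, 3.0]]

averaged = average_scores([model_a, model_b])
clusters = extract_clusters(averaged)
print(clusters)
```

In this toy input, `model_b` on its own would link mention 2 to mention 1, merging all four mentions into one cluster. After averaging, that link's score falls below the threshold, so the ensemble keeps two separate clusters.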
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
