Model-based annotation of coreference

Rahul Aralikatte; Anders Søgaard

arXiv:1906.10724 · cs.CL · March 3, 2020 · 1 cites

TL;DR

This paper introduces a model-based approach to coreference annotation that links entities to a knowledge base, simplifying the task, increasing efficiency and agreement, and providing new benchmark datasets for evaluation.

Contribution

The paper proposes a model-based annotation method for coreference, limited here to pronouns, and contributes two new datasets together with an evaluation of state-of-the-art coreference resolvers on them.

Findings

01 Model-based annotation speeds up the annotation process.

02 Model-based annotation yields higher inter-annotator agreement.

03 Two new coreference benchmark datasets are introduced, one for English Wikipedia and one for English teacher-student dialogues.

Abstract

Humans do not make inferences over texts, but over models of what texts are about. When annotators are asked to annotate coreferent spans of text, it is therefore a somewhat unnatural task. This paper presents an alternative in which we preprocess documents, linking entities to a knowledge base, and turn the coreference annotation task -- in our case limited to pronouns -- into an annotation task where annotators are asked to assign pronouns to entities. Model-based annotation is shown to lead to faster annotation and higher inter-annotator agreement, and we argue that it also opens up for an alternative approach to coreference resolution. We present two new coreference benchmark datasets, for English Wikipedia and English teacher-student dialogues, and evaluate state-of-the-art coreference resolvers on them.
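
As a rough illustration of the task format the abstract describes, the sketch below represents a preprocessed document as a set of entities already linked to knowledge-base identifiers, has two annotators assign each pronoun to one of those entities, and scores their agreement. Everything in it is an assumption made for illustration: the `Pronoun` schema, the entity IDs `E1` and `E2`, the toy assignments, and the use of Cohen's kappa as the agreement measure (the abstract does not say which measure the authors use).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Pronoun:
    """A pronoun mention located by token offset (hypothetical schema)."""
    token_index: int
    text: str


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which both annotators chose the same entity.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Entities pre-linked to a knowledge base; the IDs are illustrative placeholders.
entities = {"E1": "Marie Curie", "E2": "Pierre Curie"}

pronouns = [Pronoun(5, "she"), Pronoun(12, "he"), Pronoun(20, "her"), Pronoun(27, "his")]

# Each annotator assigns every pronoun to an entity ID instead of marking spans.
annotator_a = {pronouns[0]: "E1", pronouns[1]: "E2", pronouns[2]: "E1", pronouns[3]: "E2"}
annotator_b = {pronouns[0]: "E1", pronouns[1]: "E2", pronouns[2]: "E1", pronouns[3]: "E1"}

labels_a = [annotator_a[p] for p in pronouns]
labels_b = [annotator_b[p] for p in pronouns]
kappa = cohen_kappa(labels_a, labels_b)
print(f"agreement (Cohen's kappa): {kappa:.2f}")
```

Because every pronoun receives exactly one label from a fixed entity inventory, agreement reduces to comparing two label sequences item by item, which is simpler than aligning free-form coreferent spans.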

Code & Models

Repositories

rahular/model-based-coref · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification