Estimating Agreement by Chance for Sequence Annotation
Diya Li, Carolyn Rosé, Ao Yuan, Chunxiao Zhou

TL;DR
This paper introduces a new model for estimating chance agreement in sequence annotation tasks in NLP, addressing a gap in reliability assessment methods and providing a tool for more accurate annotation evaluation.
Contribution
It presents a novel randomization model for sequence annotations and derives an analytical distribution for chance agreement estimation, validated through simulations and corpus evaluation.
Findings
Accurately estimates chance agreement in sequence annotation
Validates the model's effectiveness through simulations
Provides a practical tool for NLP annotation reliability
Abstract
In natural language processing, correcting performance assessments for chance agreement plays a crucial role in evaluating the reliability of annotations. However, there is a notable dearth of research on chance correction for assessing the reliability of sequence annotation tasks, despite their prevalence in the field. To address this gap, this paper introduces a novel model for generating random annotations, which serves as the foundation for estimating chance agreement in sequence annotation tasks. Using the proposed randomization model and a related comparison approach, we derive the analytical form of the distribution, enabling the computation of the probable location of each annotated text segment and subsequent chance agreement estimation. Through a combination of simulation and corpus-based evaluation, we successfully assess its…
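To make the idea of chance agreement for span annotations concrete, here is a minimal Python sketch. It is not the paper's analytical model: it estimates expected agreement by Monte Carlo simulation under an assumed randomization scheme, in which each annotator's segments keep their lengths and order and are placed uniformly at random without overlap. The function names (`random_spans`, `span_f1`, `chance_agreement`, `chance_corrected`), the exact-match F1 agreement measure, and the kappa-style correction are all illustrative assumptions, not taken from the paper.

```python
import random


def random_spans(n_tokens, lengths, rng):
    """Place segments of the given lengths, in order, without overlap.

    Assumed randomization scheme: every valid placement is equally likely.
    Spans are half-open (start, end) token intervals.
    """
    free = n_tokens - sum(lengths)  # tokens not covered by any segment
    if free < 0:
        raise ValueError("segments do not fit in the text")
    k = len(lengths)
    # Pick which of the (free + k) ordered slots hold a segment.
    slots = sorted(rng.sample(range(free + k), k))
    spans, used = [], 0
    for i, (slot, length) in enumerate(zip(slots, lengths)):
        start = (slot - i) + used  # free tokens before it + earlier segments
        spans.append((start, start + length))
        used += length
    return spans


def span_f1(spans_a, spans_b):
    """Exact-match F1 between two span sets (one possible agreement measure)."""
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def chance_agreement(n_tokens, lengths_a, lengths_b, trials=2000, seed=0):
    """Monte Carlo estimate of expected agreement between random annotations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += span_f1(
            random_spans(n_tokens, lengths_a, rng),
            random_spans(n_tokens, lengths_b, rng),
        )
    return total / trials


def chance_corrected(observed, expected):
    """Kappa-style correction: (observed - expected) / (1 - expected)."""
    if expected >= 1.0:
        raise ValueError("expected agreement must be below 1")
    return (observed - expected) / (1.0 - expected)
```

For example, `chance_corrected(observed, chance_agreement(50, [3, 2], [3, 2]))` would discount an observed F1 by the agreement two annotators would reach by placing segments of those lengths at random in a 50-token text. An analytical distribution, as the abstract describes, would replace the simulation loop with a closed-form expectation.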
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
