Annotation Curricula to Implicitly Train Non-Expert Annotators

Ji-Ung Lee; Jan-Christoph Klie; Iryna Gurevych

arXiv:2106.02382 · cs.CL · December 23, 2021

TL;DR

This paper introduces annotation curricula, a method to implicitly train non-expert annotators by ordering instances to reduce mental load and improve annotation efficiency, demonstrated through a user study on Covid-19 tweets.

Contribution

It formalizes annotation curricula for sentence and paragraph tasks, proposes an ordering strategy, and validates its effectiveness through experiments and a user study.

Findings

01 Ordering instances reduces annotation time significantly.

02 High annotation quality is maintained despite faster annotation.

03 Simple heuristics can effectively guide instance ordering.
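The idea of ordering instances with a simple heuristic can be illustrated with a minimal sketch. It assumes token count as a stand-in for difficulty; that heuristic, the function names `token_count` and `order_by_difficulty`, and the example tweets are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Callable, List


def token_count(text: str) -> int:
    """Hypothetical difficulty heuristic: number of whitespace-separated tokens."""
    return len(text.split())


def order_by_difficulty(
    instances: List[str],
    difficulty: Callable[[str], float] = token_count,
) -> List[str]:
    """Return the instances sorted from easiest to hardest under the heuristic."""
    # sorted() is stable, so instances with equal scores keep their input order.
    return sorted(instances, key=difficulty)


# Made-up example instances, for illustration only.
tweets = [
    "Vaccines are being distributed across several regions this week according to officials.",
    "Stay safe.",
    "Masks reduce the spread of the virus.",
]

# Annotators would see the shortest (assumed easiest) instance first.
curriculum = order_by_difficulty(tweets)
```

Any other scoring function with the same signature could be passed as `difficulty`, which keeps the ordering step separate from the choice of heuristic.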

Abstract

Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning and mentally taxing, and can introduce errors into the resulting annotations, especially in citizen science or crowdsourcing scenarios where domain expertise is not required and only annotation guidelines are provided. To alleviate these issues, we propose annotation curricula, a novel approach to implicitly train annotators. Our goal is to gradually introduce annotators to the task by ordering the instances to be annotated according to a learning curriculum. To do so, we first formalize annotation curricula for sentence- and paragraph-level annotation tasks, define an ordering strategy, and identify well-performing heuristics and interactively trained models on three existing English datasets. We then conduct a user study…

Code & Models

Repositories

ukplab/annotation-curriculum
PyTorch · Official

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Misinformation and Its Impacts · Topic Modeling