The DALPHI annotation framework & how its pre-annotations can improve annotator efficiency

Robert Greinacher; Franziska Horn

arXiv:1808.05558 · cs.IR · August 20, 2018

TL;DR

This paper introduces DALPHI, an annotation framework whose active-learning-based assistance system suggests pre-annotations to the annotator. In a study with 66 participants annotating named entities, these suggestions improved both the quality and the quantity of the annotations produced.

Contribution

The paper presents DALPHI, a novel annotation framework that uses an active-learning-based assistance system to pre-annotate documents, and shows that these pre-annotations improve annotation quality and quantity even when their recall is only 50%.

Findings

01 Pre-annotations improve annotation quality and quantity.

02 Active-learning-based assistance reduces annotator effort.

03 Pre-annotations remain beneficial even when their recall is only 50%.

Abstract

Producing the required amounts of training data for machine learning and NLP tasks often involves human annotators doing very repetitive and monotonous work. In this paper, we present and evaluate our novel annotation framework DALPHI, which facilitates the annotation process by providing the annotator with suggestions generated by an automated, active-learning based assistance system. In a study with 66 participants, we demonstrate on the exemplary task of annotating named entities in text documents that with this assistance system the annotation processes can be improved with respect to the quality and quantity of produced annotations, even if the pre-annotations provided by the assistance system are at a recall level of only 50%.
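To make the "recall level of only 50%" concrete, the following is a minimal sketch of how the recall of pre-annotations could be measured against a gold standard. It is an illustration only, not DALPHI's code: the function name `pre_annotation_recall`, the `(start, end, label)` span format, the exact-match criterion, and the sample data are all assumptions, and the paper's own evaluation may define matches differently.

```python
# Hypothetical illustration, not taken from DALPHI.
# Each entity is a (start_offset, end_offset, label) tuple.

def pre_annotation_recall(gold, suggested):
    """Return the fraction of gold entities recovered by the suggestions.

    A suggestion counts only if it matches a gold entity exactly
    (same offsets and same label); this criterion is an assumption.
    """
    gold_set = set(gold)
    if not gold_set:
        return 0.0  # no gold entities: define recall as 0 for this sketch
    return len(gold_set & set(suggested)) / len(gold_set)


# Made-up example: four gold entities, two of them pre-annotated.
gold = [(0, 6, "PER"), (17, 23, "LOC"), (30, 36, "ORG"), (41, 47, "PER")]
suggested = [(0, 6, "PER"), (30, 36, "ORG")]

print(pre_annotation_recall(gold, suggested))  # 0.5
```

At this recall level the annotator in the example would still have to find the two missed entities by hand, which is the setting in which the study still reports a benefit.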

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Machine Learning and Data Classification