Visualizing NLP annotations for Crowdsourcing

Hanchuan Li; Haichen Shen; Shengliang Xu; Congle Zhang

arXiv:1508.06044 · cs.CL · August 26, 2015 · 2 cites

Open Access

TL;DR

CROWDANNO is a visualization toolkit designed to enable non-expert crowdsourced workers to efficiently annotate NLP data for clustering and parsing tasks through an interactive interface, improving scalability and label quality.

Contribution

The paper introduces CROWDANNO, a user-friendly visualization toolkit that simplifies NLP annotation tasks for non-experts, facilitating scalable crowdsourcing with high-quality results.

Findings

1. User studies show high annotation quality from non-experts.

2. The toolkit is easy for non-experts to use on complex NLP annotation tasks.

3. The source code and toolkit are publicly released.

Abstract

Visualizing NLP annotations is useful for collecting training data for statistical NLP approaches. Existing toolkits either provide limited visual aid or introduce comprehensive operators to realize sophisticated linguistic rules. Workers must be well trained to use them. Their audience can therefore hardly scale to large numbers of non-expert crowdsourced workers. In this paper, we present CROWDANNO, a visualization toolkit that allows crowdsourced workers to annotate two general categories of NLP problems: clustering and parsing. Workers can finish the tasks with simplified operators in an interactive interface and fix errors conveniently. User studies show that our toolkit is very friendly to NLP non-experts and allows them to produce high-quality labels for several sophisticated problems. We release our source code and toolkit to spur future research.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Mobile Crowdsensing and Crowdsourcing