Interface Design for Crowdsourcing Hierarchical Multi-Label Text Annotations
Rickard Stureborg, Bhuwan Dhingra, Jun Yang

TL;DR
This paper examines whether and how incorporating concept hierarchies into crowdsourcing interfaces improves the quality and efficiency of hierarchical multi-label text annotation, using vaccine misinformation labeling as a difficult and highly subjective test case.
Contribution
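Across 6 interface designs and over 18,000 individual annotations, it shows that, under a fixed budget, integrating hierarchies into the interface improves crowd workers' F1 scores. It attributes the gain to grouping similar concepts, strong relative performance on high-difficulty examples, and filtering out obvious negatives.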
Findings
Grouping similar concepts improves F1 scores by 0.16 over random groupings.
Hierarchical designs perform comparatively well on high-difficulty examples, with a relative F1 score difference of +0.40.
Filtering out obvious negatives raises precision by 0.07.
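For readers unfamiliar with how precision and F1 apply to multi-label annotation, the following is a minimal sketch of micro-averaged scoring over label sets. The averaging scheme, the function name micro_prf, and the label names are assumptions made for illustration; the summary above does not state how the paper aggregates its scores.

```python
from typing import List, Set, Tuple


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over per-example label sets.

    Assumption: counts are pooled across all examples before dividing.
    The paper may aggregate differently (e.g. per label or per worker).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of examples")

    # Pool true positives, false positives, and false negatives over all examples.
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical labels for three passages: reference labels vs. one worker's labels.
gold_labels = [{"safety", "side_effects"}, {"conspiracy"}, set()]
worker_labels = [{"safety"}, {"conspiracy", "safety"}, set()]

precision, recall, f1 = micro_prf(gold_labels, worker_labels)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the worker misses one reference label and adds one spurious label, so precision, recall, and F1 all come out to 2/3. An interface that filters out obvious negatives would be expected to reduce spurious labels, which is the quantity precision tracks.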
Abstract
Human data labeling is an important and expensive task at the heart of supervised learning systems. Hierarchies help humans understand and organize concepts. We ask whether and how concept hierarchies can inform the design of annotation interfaces to improve labeling quality and efficiency. We study this question through annotation of vaccine misinformation, where the labeling task is difficult and highly subjective. We investigate 6 user interface designs for crowdsourcing hierarchical labels by collecting over 18,000 individual annotations. Under a fixed budget, integrating hierarchies into the design improves crowdsource workers' F1 scores. We attribute this to (1) Grouping similar concepts, improving F1 scores by +0.16 over random groupings, (2) Strong relative performance on high-difficulty examples (relative F1 score difference of +0.40), and (3) Filtering out obvious negatives,…
