TagRuler: Interactive Tool for Span-Level Data Programming by Demonstration
Dongjin Choi, Sara Evensen, Çağatay Demiralp, Estevam Hruschka

TL;DR
TagRuler is an interactive tool that enables non-programmers to create span-level data labeling functions efficiently, improving annotation quality and reducing time for NLP tasks through demonstration-based data programming.
Contribution
This work extends the Data Programming by Demonstration framework to span-level NLP annotation, introducing a user-friendly tool that enhances annotation accuracy without requiring programming skills.
Findings
Higher F1 scores achieved with TagRuler compared to manual labeling
Effective span-level annotation facilitated without programming
Empirical validation across multiple NLP tasks
Abstract
Despite rapid developments in the field of machine learning research, collecting high-quality labels for supervised learning remains a bottleneck for many applications. This difficulty is exacerbated by the fact that state-of-the-art models for NLP tasks are becoming deeper and more complex, often increasing the amount of training data required even for fine-tuning. Weak supervision methods, including data programming, address this problem and reduce the cost of label collection by using noisy label sources for supervision. However, until recently, data programming was only accessible to users who knew how to program. To bridge this gap, the Data Programming by Demonstration framework was proposed to facilitate the automatic creation of labeling functions based on a few examples labeled by a domain expert. This framework has proven successful for generating high-accuracy labeling models…
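To make the idea of span-level data programming concrete, the following is a minimal sketch, not TagRuler's implementation. It assumes that each labeling function casts a per-token vote or abstains, and it substitutes a simple majority vote for the learned labeling model the abstract mentions. The rule names, the `ENTITY`/`O` label set, and the example sentence are all hypothetical.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns ABSTAIN when it has no opinion
TITLES = {"Dr.", "Mr.", "Ms."}  # hypothetical title list used by the rules below

def lf_capitalized(tokens):
    """Hypothetical rule: ENTITY for capitalized, non-initial, non-title tokens."""
    return ["ENTITY" if i > 0 and tok[:1].isupper() and tok not in TITLES else ABSTAIN
            for i, tok in enumerate(tokens)]

def lf_after_title(tokens):
    """Hypothetical rule: ENTITY for the token right after a title."""
    return ["ENTITY" if i > 0 and tokens[i - 1] in TITLES else ABSTAIN
            for i in range(len(tokens))]

def lf_stopword(tokens):
    """Hypothetical rule: O (outside any span) for common function words."""
    stop = {"the", "a", "an", "of", "in", "at", "met", "and"}
    return ["O" if tok.lower() in stop else ABSTAIN for tok in tokens]

def majority_vote(tokens, lfs, default="O"):
    """Combine per-token votes; ties and all-abstain fall back to `default`.

    This is a stand-in for a learned label model, chosen for brevity.
    """
    votes_per_lf = [lf(tokens) for lf in lfs]
    labels = []
    for token_votes in zip(*votes_per_lf):
        counts = Counter(v for v in token_votes if v is not ABSTAIN)
        ranked = counts.most_common(2)
        if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
            labels.append(default)
        else:
            labels.append(ranked[0][0])
    return labels

def to_spans(labels, target="ENTITY"):
    """Merge consecutive `target` tokens into (start, end) spans, end exclusive."""
    spans, start = [], None
    for i, lab in enumerate(labels):
        if lab == target and start is None:
            start = i
        elif lab != target and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans

tokens = ["Yesterday", "Dr.", "Alice", "Smith", "met", "Bob", "in", "Paris"]
labels = majority_vote(tokens, [lf_capitalized, lf_after_title, lf_stopword])
print(to_spans(labels))
```

In a demonstration-based workflow, rules like these would be proposed by the tool from a few spans the user highlights rather than written by hand; the sketch only shows how several noisy token-level votes can be combined into span annotations.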
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
