Does Putting a Linguist in the Loop Improve NLU Data Collection?
Alicia Parrish, William Huang, Omar Agha, Soo-Hwan Lee, Nikita Nangia, Alex Warstadt, Karmanya Aggarwal, Emily Allaway, Tal Linzen, Samuel R. Bowman

TL;DR
Involving linguists during crowdsourced NLU data collection helps create more challenging datasets and allows for dynamic gap mitigation, but does not necessarily improve out-of-domain model performance.
Contribution
This study demonstrates that real-time expert involvement during data collection enhances dataset quality and challenge level, introducing a novel iterative, linguist-in-the-loop protocol.
Findings
Linguist-in-the-loop datasets are more reliably challenging.
No significant improvement in out-of-domain performance with linguist involvement.
Chatroom interaction between linguists and crowdworkers has no measurable effect.
Abstract
Many crowdsourced NLP datasets contain systematic gaps and biases that are identified only after data collection is complete. Identifying these issues from early data samples during crowdsourcing should make mitigation more efficient, especially when done iteratively. We take natural language inference as a test case and ask whether it is beneficial to put a linguist "in the loop" during data collection to dynamically identify and address gaps in the data by introducing novel constraints on the task. We directly compare three data collection protocols: (i) a baseline protocol, (ii) a linguist-in-the-loop intervention with iteratively-updated constraints on the task, and (iii) an extension of linguist-in-the-loop that provides direct interaction between linguists and crowdworkers via a chatroom. The datasets collected with linguist involvement are more reliably challenging than baseline,…
Code & Models
- 🤗 MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 · model · 193k dl · ♡ 355
- 🤗 MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli · model · 38k dl · ♡ 120
- 🤗 MoritzLaurer/DeBERTa-v3-base-mnli-fever-docnli-ling-2c · model · 744 dl · ♡ 12
- 🤗 MoritzLaurer/DeBERTa-v3-small-mnli-fever-docnli-ling-2c · model · 87 dl
- 🤗 MoritzLaurer/DeBERTa-v3-xsmall-mnli-fever-anli-ling-binary · model · 60k dl · ♡ 6
- 🤗 MoritzLaurer/MiniLM-L6-mnli-fever-docnli-ling-2c · model · 13 dl · ♡ 2
- 🤗 MoritzLaurer/xtremedistil-l6-h256-mnli-fever-anli-ling-binary · model · 27 dl · ♡ 3
- 🤗 pkshatech/GLuCoSE-base-ja · model · 20k dl · ♡ 34
- 🤗 Lowerated/lm6-deberta-v3-topic-sentiment · model · 3 dl · ♡ 2
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Mobile Crowdsensing and Crowdsourcing
