ARTICLE: Annotator Reliability Through In-Context Learning
Sujan Dutta, Deepak Pandita, Tharindu Cyril Weerasooriya, Marcos Zampieri, Christopher M. Homan, Ashiqur R. KhudaBukhsh

TL;DR
This paper introduces ARTICLE, an in-context learning framework that assesses annotator reliability by leveraging self-consistency, aiming to enhance data quality in subjective NLP tasks like offensive speech detection.
Contribution
The paper proposes a novel ICL-based method for evaluating annotator quality, addressing subjectivity and disagreement issues in NLP annotations.
Findings
ARTICLE effectively identifies reliable annotators.
The framework outperforms traditional quality assessment methods.
It improves data quality in subjective NLP tasks.
Abstract
Ensuring annotator quality in training and evaluation data is a key piece of machine learning in NLP. Tasks such as sentiment analysis and offensive speech detection are intrinsically subjective, creating a challenging scenario for traditional quality assessment approaches because it is hard to distinguish disagreement due to poor work from that due to differences of opinions between sincere annotators. With the goal of increasing diverse perspectives in annotation while ensuring consistency, we propose ARTICLE, an in-context learning (ICL) framework to estimate annotation quality through self-consistency. We evaluate this framework on two offensive speech datasets using multiple LLMs and compare its performance with traditional methods. Our findings indicate that ARTICLE can be used as a robust method for identifying reliable annotators, hence improving data quality.
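The abstract does not spell out how self-consistency is computed, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that part of an annotator's own labels serve as in-context examples, that a model then predicts the annotator's remaining labels, and that plain accuracy on those held-out items serves as the score. The names `self_consistency` and `overlap_predictor` are hypothetical, and the word-overlap predictor is a toy stand-in for an LLM call.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, label) pair from a single annotator
Predictor = Callable[[List[Example], str], str]


def self_consistency(annotations: List[Example], predict: Predictor, n_context: int) -> float:
    """Fraction of held-out labels reproduced from the annotator's own context examples.

    Assumption: accuracy is used as the score; the paper may use a different metric.
    """
    context, held_out = annotations[:n_context], annotations[n_context:]
    if not context or not held_out:
        raise ValueError("need at least one context and one held-out example")
    hits = sum(predict(context, text) == label for text, label in held_out)
    return hits / len(held_out)


def overlap_predictor(context: List[Example], text: str) -> str:
    """Toy stand-in for an LLM: copy the label of the context example sharing the most words."""
    words = set(text.lower().split())
    best = max(context, key=lambda ex: len(words & set(ex[0].lower().split())))
    return best[1]


# An annotator whose held-out labels follow the pattern set by their context labels.
consistent = [
    ("you are an idiot", "offensive"),
    ("have a nice day", "not_offensive"),
    ("what an idiot", "offensive"),
    ("nice day today", "not_offensive"),
]
# Same texts, but the held-out labels contradict the annotator's own earlier labels.
erratic = consistent[:2] + [
    ("what an idiot", "not_offensive"),
    ("nice day today", "offensive"),
]

consistent_score = self_consistency(consistent, overlap_predictor, n_context=2)
erratic_score = self_consistency(erratic, overlap_predictor, n_context=2)
```

In this toy setup the score depends only on whether an annotator's labels are predictable from their own earlier labels, not on agreement with other annotators, which is the property that lets sincere minority opinions be told apart from inconsistent work.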
Taxonomy
Topics: Speech and dialogue systems · Speech Recognition and Synthesis · Music and Audio Processing
