Keeping Humans in the Loop: Human-Centered Automated Annotation with Generative AI
Nicholas Pangakis, Samuel Wolken

TL;DR
This paper evaluates the use of GPT-4 for automated text annotation in social media research, emphasizing the importance of human oversight and careful validation to ensure responsible and accurate AI-assisted annotation.
Contribution
It introduces a human-centered framework for evaluating AI annotation tools, highlighting variability in GPT-4's performance and the necessity of human validation for responsible use.
Findings
GPT-4's annotations are generally high quality, but performance varies across tasks.
Automated annotations often diverge from human judgment.
Human validation is crucial for responsible AI annotation.
Abstract
Automated text annotation is a compelling use case for generative large language models (LLMs) in social media research. Recent work suggests that LLMs can achieve strong performance on annotation tasks; however, these studies evaluate LLMs on a small number of tasks and likely suffer from contamination due to a reliance on public benchmark datasets. Here, we test a human-centered framework for responsibly evaluating artificial intelligence tools used in automated annotation. We use GPT-4 to replicate 27 annotation tasks across 11 password-protected datasets from recently published computational social science articles in high-impact journals. For each task, we compare GPT-4 annotations against human-annotated ground-truth labels and against annotations from separate supervised classification models fine-tuned on human-generated labels. Although the quality of LLM labels is generally…
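The per-task comparison described in the abstract, scoring GPT-4 labels against human-annotated ground truth, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `score_task` and `flag_for_review`, the binary label encoding (1 = positive class), and the 0.7 F1 threshold are all hypothetical choices introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Scores:
    accuracy: float
    precision: float
    recall: float
    f1: float


def score_task(human: list[int], llm: list[int], positive: int = 1) -> Scores:
    """Compare LLM labels with human ground-truth labels for one task.

    Assumes both lists are aligned item by item (hypothetical data layout).
    """
    if len(human) != len(llm):
        raise ValueError("human and llm label lists must have equal length")
    if not human:
        raise ValueError("label lists must not be empty")

    pairs = list(zip(human, llm))
    tp = sum(1 for h, p in pairs if h == positive and p == positive)
    fp = sum(1 for h, p in pairs if h != positive and p == positive)
    fn = sum(1 for h, p in pairs if h == positive and p != positive)
    correct = sum(1 for h, p in pairs if h == p)

    # Define undefined ratios as 0.0 so that degenerate tasks still get a score.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Scores(correct / len(pairs), precision, recall, f1)


def flag_for_review(
    tasks: dict[str, tuple[list[int], list[int]]], min_f1: float = 0.7
) -> list[str]:
    """Return the names of tasks whose F1 falls below min_f1.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        name
        for name, (human, llm) in tasks.items()
        if score_task(human, llm).f1 < min_f1
    ]
```

Scoring each task separately, rather than pooling all labels, reflects the point that performance varies across tasks: a strong aggregate score can hide individual tasks where the automated labels diverge from human judgment and need closer human validation.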
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · AI-based Problem Solving and Planning
Methods: Linear Layer · Multi-Head Attention · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Attention Is All You Need · Position-Wise Feed-Forward Layer · Dropout
