The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection

Tomas Horych, Christoph Mandl, Terry Ruas, Andre Greiner-Petter, Bela Gipp, Akiko Aizawa, and Timo Spinde

arXiv:2411.11081 · cs.CL · January 27, 2025

TL;DR

This paper explores using Large Language Models to automate media bias annotation, creating a large dataset that enables training effective bias classifiers while analyzing the benefits and limitations of this approach.

Contribution

The paper introduces annolexical, the first large-scale media bias dataset annotated by LLMs, and shows that classifiers trained on this data outperform the annotator LLMs and rival classifiers trained on human-labeled data.

Findings

01

A classifier trained on LLM-annotated data outperforms the annotator LLMs by 5-9% in Matthews Correlation Coefficient (MCC).

02

The dataset enables cost-effective media bias classification.

03

Limitations and trade-offs of LLM-based annotation are identified.
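The first finding is reported in Matthews Correlation Coefficient (MCC), so a short sketch of how MCC is computed for a binary task may help when reading the numbers. The function name `binary_mcc`, the label encoding (1 = biased, 0 = not biased), and the toy labels below are illustrative assumptions; they are not taken from the paper or its repository.

```python
import math


def binary_mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0 or 1).

    Hypothetical helper for illustration; assumes 1 = biased, 0 = not biased.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is defined as 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example with made-up labels (not data from the paper).
gold = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_mcc(gold, pred))  # 0.5
```

MCC ranges from -1 (total disagreement) through 0 (chance-level agreement) to 1 (perfect agreement), and unlike accuracy it stays informative when the two classes are imbalanced.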

Abstract

High annotation costs from hiring or crowdsourcing complicate the creation of large, high-quality datasets needed for training reliable text classifiers. Recent research suggests using Large Language Models (LLMs) to automate the annotation process, reducing these costs while maintaining data quality. LLMs have shown promising results in annotating downstream tasks like hate speech detection and political framing. Building on the success in these areas, this study investigates whether LLMs are viable for annotating the complex task of media bias detection and whether a downstream media bias classifier can be trained on such data. We create annolexical, the first large-scale dataset for media bias classification with over 48000 synthetically annotated examples. Our classifier, fine-tuned on this dataset, surpasses all of the annotator LLMs by 5-9 percent in Matthews Correlation…

Code & Models

Repositories

Media-Bias-Group/llm-annotations-annomatic (official)

Taxonomy

Topics: Computational and Text Analysis Methods