The MediaSpin Dataset: Post-Publication News Headline Edits Annotated for Media Bias

Preetika Verma; Kokil Jaidka

arXiv:2412.02271·cs.CL·May 18, 2026

TL;DR

MediaSpin is a large dataset capturing how major news outlets modify headlines post-publication, enabling analysis of media bias, framing, and social media engagement through annotated headline pairs and downstream applications.

Contribution

This work introduces MediaSpin, a large-scale dataset of post-publication headline edits annotated for media bias, and demonstrates its utility for bias classification and behavioral analysis.

Findings

01. Regional differences in headline framing and bias.

02. Measurable linguistic markers of bias.

03. Higher engagement with biased headlines.

Abstract

We present MediaSpin, a large-scale language resource capturing how major news outlets modify headlines after publication, and MediaSpin-in-the-Wild, a complementary dataset linking these revised headlines to their downstream engagement on social media. The increasing editability of online news headlines offers new opportunities to study linguistic framing and bias through the lens of editorial revisions. The dataset contains 78,910 headline pairs annotated for 13 types of media bias, grounded in established media-bias taxonomies, covering both subjective (e.g., sensationalism, spin) and objective (e.g., omission, slant) forms, with annotation conducted through a human-supervised large-language-model pipeline with expert validation and quality control. We describe the annotation schema and demonstrate three downstream applications: (1) cross-national analysis of how country references…
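To make the notion of a headline pair concrete, the following is a minimal sketch of how the words changed by a post-publication edit could be extracted with Python's standard `difflib`. It is not the authors' pipeline: the `original` and `revised` field names, the `headline_edit` helper, and the sample headline are all hypothetical and chosen only for illustration.

```python
import difflib


def headline_edit(original: str, revised: str) -> dict:
    """Return the words removed from and added to a headline by a revision."""
    before = original.split()
    after = revised.split()
    # Compare word sequences rather than characters so edits read as words.
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(before[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(after[j1:j2])
    return {"removed": removed, "added": added}


# Made-up example pair (assumption: not taken from MediaSpin).
pair = {
    "original": "Council approves new budget",
    "revised": "Council rams through controversial budget",
}
edit = headline_edit(pair["original"], pair["revised"])
print(edit)
```

A word-level diff of this kind only surfaces what changed; deciding whether a change reflects one of the 13 bias types is the annotation step the abstract describes.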
