DpgMedia2019: A Dutch News Dataset for Partisanship Detection

Chia-Lun Yeh; Babak Loni; Mariëlle Hendriks; Henrike Reinhardt; Anne Schuth

arXiv:1908.02322 · cs.CL · August 8, 2019 · 1 citation

TL;DR

This paper introduces a large Dutch news dataset with publisher and article-level partisanship labels, enabling research on media bias and partisanship detection in Dutch news articles.

Contribution

The paper provides a new Dutch news dataset with over 100K publisher-labeled articles and 776 crowd-labeled articles, detailing its collection, annotation, and potential applications.

Findings

01. The dataset contains more than 100K articles labeled at the publisher level.

02. 776 articles were crowdsourced and labeled at the article level.

03. The dataset facilitates research on media partisanship in Dutch news.

Abstract

We present a new Dutch news dataset with labeled partisanship. The dataset contains more than 100K articles that are labeled on the publisher level and 776 articles that were crowdsourced using an internal survey platform and labeled on the article level. In this paper, we document our original motivation, the collection and annotation process, limitations, and applications.

Code & Models

Repositories

dpgmedia/partisan-news2019 (official)

Taxonomy

Topics: Topic Modeling · Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection