Generating Media Background Checks for Automated Source Critical Reasoning

Michael Schlichtkrull

arXiv:2409.00781·cs.CL·September 4, 2024

TL;DR

This paper introduces a new NLP task: generating media background checks that help assess source credibility. It contributes a dataset, evaluates model baselines, and assesses how useful the checks are to humans, addressing a gap in source-critical reasoning.

Contribution

The paper defines a new task, presents a novel dataset of media background checks, and evaluates both model baselines and the usefulness of the checks to humans, advancing source criticism in NLP.

Findings

01 Retrieval improves model performance significantly.

02 Media background checks aid human understanding.

03 Background checks are beneficial for retrieval-augmented models.

Abstract

Not everything on the internet is true. This unfortunate fact requires both humans and models to perform complex reasoning about credibility when working with retrieved information. In NLP, this problem has seen little attention. Indeed, retrieval-augmented models are not typically expected to distrust retrieved documents. Human experts overcome the challenge by gathering signals about the context, reliability, and tendency of source documents - that is, they perform source criticism. We propose a novel NLP task focused on finding and summarising such signals. We introduce a new dataset of 6,709 "media background checks" derived from Media Bias / Fact Check, a volunteer-run website documenting media bias. We test open-source and closed-source LLM baselines with and without retrieval on this dataset, finding that retrieval greatly improves performance. We furthermore carry out human…

Taxonomy

Topics: Natural Language Processing Techniques · Web Application Security Vulnerabilities · Topic Modeling