Bias in News Summarization: Measures, Pitfalls and Corpora

Julius Steen; Katja Markert

arXiv:2309.08047·cs.CL·June 7, 2024·1 cites

TL;DR

This paper investigates how social biases, such as gender and racial bias, influence news summarization models. It introduces a method for generating input documents with controlled demographic attributes and analyzes how bias affects content selection and hallucinations.

Contribution

It proposes a controlled setting for bias analysis in summarization, introduces operational definitions of biased behaviors, and evaluates bias effects in both specialized and general models.

Findings

01. Gender bias minimally affects content selection
02. Hallucinations show evidence of bias
03. Method applicable to racial and intersectional bias analysis

Abstract

Summarization is an important application of large language models (LLMs). Most previous evaluation of summarization models has focused on their content selection, faithfulness, grammaticality and coherence. However, it is well known that LLMs can reproduce and reinforce harmful social biases. This raises the question: Do biases affect model outputs in a constrained setting like summarization? To help answer this question, we first motivate and introduce a number of definitions for biased behaviours in summarization models, along with practical operationalizations. Since we find that biases inherent to input documents can confound bias analysis in summaries, we propose a method to generate input documents with carefully controlled demographic attributes. This allows us to study summarizer behavior in a controlled setting, while still working with realistic input documents. We measure…

Code & Models

Repositories

julmaxi/summary_bias (Official)

Videos

Bias in News Summarization: Measures, Pitfalls and Corpora (Underline)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Authorship Attribution and Profiling