Frame In, Frame Out: Measuring Framing Bias in LLM-Generated News Summaries
Valeria Pastorino, Nafise Sadat Moosavi

TL;DR
This paper introduces FIFO, a large-scale benchmark for measuring framing bias in LLM-generated news summaries, and finds that these summaries often exhibit higher framing rates than human-written references.
Contribution
The paper presents FIFO, the first large-scale benchmark for measuring framing in LLM-generated news summaries, and analyzes framing rates across 27 summarization models.
Findings
LLM-generated summaries often show higher calibrated framing rates than human-written references.
Framing bias varies across topics and training regimes.
Scientific and public health summaries have elevated framing rates.
Abstract
News headlines and summaries shape how events are interpreted through selective emphasis and omission, a phenomenon commonly referred to as framing. Large language models are now routinely used to generate such content, yet existing evaluation frameworks largely overlook this dimension. We introduce Frame In, Frame Out (FIFO), the first large-scale benchmark for measuring framing presence in LLM-generated news summaries, grounded in the widely used XSum dataset. FIFO combines 15,499 jury-annotated examples with 320 expert-labeled instances to validate and calibrate model-based annotations. Using FIFO, we analyze measured framing rates across 27 summarization models. We find that LLM-generated summaries often exhibit higher calibrated framing rates than human-written references, with substantial variation across topics and training regimes, including elevated rates in…
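The abstract does not say how measured framing rates are calibrated against the expert labels. As an illustration only, here is a minimal sketch of one standard approach, the Rogan-Gladen correction, which adjusts an observed positive rate using the annotator's sensitivity and specificity estimated on a gold-labeled subset. The function names and the toy labels are hypothetical, and this is not presented as the authors' procedure.

```python
def sensitivity_specificity(expert_labels, model_labels):
    """Estimate sensitivity and specificity of model labels against expert labels.

    Both inputs are equal-length sequences of 0/1 (1 = framing present).
    """
    tp = fn = tn = fp = 0
    for expert, model in zip(expert_labels, model_labels):
        if expert == 1 and model == 1:
            tp += 1
        elif expert == 1 and model == 0:
            fn += 1
        elif expert == 0 and model == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("expert labels must contain both classes")
    return tp / (tp + fn), tn / (tn + fp)


def calibrated_rate(observed_rate, sensitivity, specificity):
    """Rogan-Gladen estimate of the true positive rate (an assumed method)."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("annotator is no better than chance; cannot calibrate")
    estimate = (observed_rate + specificity - 1.0) / denom
    # Clip to [0, 1] because sampling noise can push the estimate outside it.
    return min(1.0, max(0.0, estimate))


# Toy validation subset (hypothetical labels, not data from the paper).
expert = [1, 1, 1, 1, 0, 0, 0, 0]
model = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(expert, model)
corrected = calibrated_rate(0.6, sens, spec)
```

In this sketch the correction widens the gap from 0.5 when the annotator is imperfect: an observed rate above 0.5 is adjusted upward and one below 0.5 downward, which is why comparing raw and calibrated rates across models can give different rankings.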
