Rewrite the News: Tracing Editorial Reuse Across News Agencies

Soveatin Kuntur; Nina Smirnova; Anna Wroblewska; Philipp Mayr; Sebastijan Razboršek Maček

arXiv:2603.29937·cs.CL·April 1, 2026

TL;DR

This study introduces a weakly supervised method for detecting cross-lingual sentence reuse in journalism without requiring full translations. It reveals how editorial content is shared across languages and news agencies.

Contribution

The paper presents an approach for identifying multilingual sentence-level reuse in news articles without full translations, using publication timestamps to retain the earliest likely foreign source for each reused sentence.

Findings

01 Reuse occurs in 52% of Slovenian Press Agency (STA) articles and 1.6% of foreign agency (FA) articles.

02 Reused content is mostly paraphrased rather than copied verbatim.

03 Reused sentences tend to appear in the middle and at the end of articles.
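The positional finding can be illustrated with a minimal sketch that assigns each reused sentence to a coarse region of its article. The equal-thirds split, the function names `position_bucket` and `tally_positions`, and the input format are assumptions made for illustration only; the listing does not say how the authors define "middle" and "end".

```python
from collections import Counter


def position_bucket(sentence_index, n_sentences):
    """Map a 0-based sentence index to 'beginning', 'middle', or 'end'.

    Assumption: the article is split into three equal parts. Integer
    arithmetic is used so boundaries do not depend on float rounding.
    """
    if n_sentences <= 0 or not 0 <= sentence_index < n_sentences:
        raise ValueError("sentence_index must lie within the article")
    if 3 * sentence_index < n_sentences:
        return "beginning"
    if 3 * sentence_index < 2 * n_sentences:
        return "middle"
    return "end"


def tally_positions(reused):
    """Count buckets for an iterable of (sentence_index, n_sentences) pairs."""
    return Counter(position_bucket(i, n) for i, n in reused)
```

Calling `tally_positions` on the reused sentences of a corpus would give one count per region, which could then be compared against the distribution of all sentences.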

Abstract

This paper investigates sentence-level text reuse in multilingual journalism, analyzing where reused content occurs within articles. We present a weakly supervised method for detecting sentence-level cross-lingual reuse without requiring full translations, designed to support automated pre-selection to reduce information overload for journalists (Holyst et al., 2024). The study compares English-language articles from the Slovenian Press Agency (STA) with reports from 15 foreign agencies (FA) in seven languages, using publication timestamps to retain the earliest likely foreign source for each reused sentence. We analyze 1,037 STA and 237,551 FA articles from two time windows (October 7-November 2, 2023; February 1-28, 2025) and identify 1,087 aligned sentence pairs after filtering to the earliest sources. Reuse occurs in 52% of STA articles and 1.6% of FA articles and is predominantly…
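The timestamp-based filtering step mentioned in the abstract can be sketched as follows. This is an illustrative reading, not the authors' implementation: the `Candidate` record, its field names, and the rule that a foreign article must be published no later than the STA article are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Candidate:
    """One candidate alignment between an STA sentence and a foreign article.

    All field names are hypothetical.
    """
    sta_sentence_id: str
    fa_article_id: str
    fa_published: datetime
    sta_published: datetime


def earliest_sources(candidates):
    """Keep, for each STA sentence, the earliest-published foreign candidate.

    Assumption: a foreign article published after the STA article cannot
    be its source, so such candidates are skipped.
    """
    best = {}
    for c in candidates:
        if c.fa_published > c.sta_published:
            continue
        current = best.get(c.sta_sentence_id)
        if current is None or c.fa_published < current.fa_published:
            best[c.sta_sentence_id] = c
    return best
```

The result maps each STA sentence identifier to a single retained candidate, so the number of entries corresponds to the number of aligned pairs left after filtering.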

Code & Models

Repositories

kunturs/lrec2026-rewrite-news (GitHub)
