Crafting Tomorrow's Headlines: Neural News Generation and Detection in English, Turkish, Hungarian, and Persian
Cem Üyük, Danica Rovó, Shaghayegh Kolli, Rabia Varol, Georg Groh, Daryna Dementieva

TL;DR
This paper introduces a multilingual benchmark dataset and evaluates various detection methods to identify machine-generated news across four languages, addressing misinformation challenges posed by advanced LLMs.
Contribution
It provides a novel benchmark dataset for neural news detection in four languages and evaluates multiple detection approaches, including linguistic features and Transformer models.
Findings
Detection performance varies across languages and models.
Transformer-based detectors show promising robustness.
Multilingual detection remains a challenging task.
Abstract
In the era dominated by information overload and its facilitation with Large Language Models (LLMs), the prevalence of misinformation poses a significant threat to public discourse and societal well-being. A critical concern at present involves the identification of machine-generated news. In this work, we take a significant step by introducing a benchmark dataset designed for neural news detection in four languages: English, Turkish, Hungarian, and Persian. The dataset incorporates outputs from multiple multilingual generators (in both zero-shot and fine-tuned setups) such as BloomZ, LLaMa-2, Mistral, Mixtral, and GPT-4. Next, we experiment with a variety of classifiers, ranging from those based on linguistic features to advanced Transformer-based models and LLM prompting. We present the detection results aiming to delve into the interpretability and robustness of machine-generated…
Code & Models
- 🤗 tum-nlp/neural-news-discriminator-BERT-hu (model, 11 downloads)
- 🤗 tum-nlp/neural-news-discriminator-BERT-en (model, 7 downloads, 1 like)
- 🤗 tum-nlp/neural-news-discriminator-BERT-tr (model, 8 downloads)
- 🤗 tum-nlp/neural-news-discriminator-BERT-fa (model, 6 downloads)
- 🤗 tum-nlp/neural-news-discriminator-RoBERTa-en (model, 4 downloads)
- 🤗 tum-nlp/neural-news-discriminator-RoBERTa-tr (model, 2 downloads)
- 🤗 tum-nlp/neural-news-discriminator-RoBERTa-hu (model, 5 downloads)
- 🤗 tum-nlp/neural-news-discriminator-RoBERTa-fa (model, 7 downloads)
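
The checkpoints listed above follow a regular naming pattern (architecture plus language code), so choosing one can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the authors' published usage: it assumes the checkpoints are sequence-classification models that load through the Hugging Face `transformers` `pipeline("text-classification", ...)` call, and the helper names `model_id` and `load_detector` are hypothetical. The label names each model returns are not stated here, so check the individual model cards before interpreting predictions.

```python
# Hypothetical helpers for picking one of the listed detector checkpoints.
# Only the repository names come from the list above; everything else is
# an illustrative assumption.

SUPPORTED_ARCHS = ("BERT", "RoBERTa")
SUPPORTED_LANGS = ("en", "tr", "hu", "fa")  # English, Turkish, Hungarian, Persian


def model_id(arch: str, lang: str) -> str:
    """Build the Hugging Face repository name for an architecture/language pair."""
    if arch not in SUPPORTED_ARCHS:
        raise ValueError(f"unknown architecture: {arch!r}")
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unknown language code: {lang!r}")
    return f"tum-nlp/neural-news-discriminator-{arch}-{lang}"


def load_detector(arch: str, lang: str):
    """Load a text-classification pipeline for the chosen checkpoint.

    Assumes the checkpoint is compatible with the text-classification
    pipeline; requires the `transformers` package and network access.
    """
    from transformers import pipeline  # imported lazily; optional dependency

    return pipeline("text-classification", model=model_id(arch, lang))


# Example usage (downloads the model, so it is left as a comment):
#   detector = load_detector("BERT", "en")
#   detector("Text of a news article to classify.")
```

Keeping the name construction separate from the loading step makes it possible to validate the architecture and language choice without downloading any weights.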
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Residual Connection · Multi-Head Attention · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Dense Connections · Byte Pair Encoding · Absolute Position Encodings
