TL;DR
This paper investigates how large language models behave when prompted to generate misinformation across different languages and target countries, revealing systematic disparities and uneven protection from existing mitigation strategies.
Contribution
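It introduces GlobalLies, a multilingual parallel dataset of 440 misinformation generation prompt templates and 6,867 entities spanning 8 languages and 195 countries, and uses it to analyze how LLM misinformation generation differs across languages and target countries.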
Findings
Misinformation generation varies systematically by country and language.
LLMs propagate substantially more misinformation in many lower-resource languages and for countries with a lower Human Development Index (HDI).
Existing mitigation strategies show uneven effectiveness across regions.
Abstract
Misinformation is on the rise, and the strong writing capabilities of LLMs lower the barrier for malicious actors to produce and disseminate false information. We study how LLMs behave when prompted to spread misinformation across languages and target countries, and introduce GlobalLies, a multilingual parallel dataset of 440 misinformation generation prompt templates and 6,867 entities, spanning 8 languages and 195 countries. Using both human annotations and large-scale LLM-as-a-judge evaluations across hundreds of thousands of generations from state-of-the-art models, we show that misinformation generation varies systematically based on the country being discussed. Propagation of lies by LLMs is substantially higher in many lower-resource languages and for countries with a lower Human Development Index (HDI). We find that existing mitigation strategies provide uneven protection: input…
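As a rough illustration of the kind of aggregation the abstract describes, the sketch below computes a per-group propagation rate from judge labels. It is not the authors' pipeline: the record fields (`language`, `country`, `judged_misinformation`), the function name, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def propagation_rate_by_group(records, key):
    """Return the fraction of generations judged as misinformation per group.

    `records` is an iterable of dicts; `key` names the field to group by
    (for example "language" or "country"). All field names are assumptions.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for rec in records:
        group = rec[key]
        totals[group] += 1
        if rec["judged_misinformation"]:
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


# Hypothetical toy records, NOT data from the paper.
records = [
    {"language": "lang_1", "country": "country_a", "judged_misinformation": False},
    {"language": "lang_1", "country": "country_b", "judged_misinformation": True},
    {"language": "lang_2", "country": "country_a", "judged_misinformation": True},
    {"language": "lang_2", "country": "country_b", "judged_misinformation": True},
]

rates_by_language = propagation_rate_by_group(records, "language")
rates_by_country = propagation_rate_by_group(records, "country")
```

Grouping by `country` instead of `language` gives the per-country view; comparing those rates against an external HDI table would be a separate step not shown here.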
