BLUFF: Benchmarking the Detection of False and Synthetic Content across 58 Low-Resource Languages
Jason Lucas, Matt Murtagh-White, Adaku Uchendu, Ali Al-Lawati, Michiharu Yamashita, Dominik Macko, Ivan Srba, Robert Moro, Dongwon Lee

TL;DR
BLUFF is a large-scale multilingual benchmark for detecting false and synthetic content across 79 languages. It addresses the lack of low-resource language coverage and provides tools for more equitable misinformation detection.
Contribution
The paper introduces BLUFF, a multilingual benchmark dataset covering 79 languages, together with novel content generation and filtering methods, to improve false content detection in low-resource languages.
Findings
State-of-the-art detectors degrade by up to 25.3% F1 on low-resource languages.
BLUFF covers both high-resource and low-resource languages, filling a critical research gap.
Extensive linguistically oriented evaluation and open-source tools are provided.
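To make the F1 degradation finding concrete, the following is a minimal sketch of how a gap between language groups could be measured. It is not the paper's evaluation code: the function names, the toy records, and the choice to report the gap as an absolute difference in F1 are all assumptions made for illustration (this summary does not say whether the 25.3% figure is absolute or relative).

```python
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class; returns 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_by_group(records):
    """Group (group, true_label, predicted_label) records and score each group."""
    grouped = defaultdict(lambda: ([], []))
    for group, true_label, pred_label in records:
        grouped[group][0].append(true_label)
        grouped[group][1].append(pred_label)
    return {g: f1_score(ts, ps) for g, (ts, ps) in grouped.items()}


# Hypothetical toy predictions (1 = false/synthetic, 0 = genuine); not BLUFF data.
records = [
    ("big-head", 1, 1), ("big-head", 1, 1), ("big-head", 0, 0), ("big-head", 0, 1),
    ("long-tail", 1, 1), ("long-tail", 1, 0), ("long-tail", 0, 1), ("long-tail", 0, 0),
]

scores = f1_by_group(records)
# Assumed definition of "degradation": absolute F1 difference between groups.
gap = scores["big-head"] - scores["long-tail"]
print(f"big-head F1={scores['big-head']:.2f}, "
      f"long-tail F1={scores['long-tail']:.2f}, gap={gap:.2f}")
```

Scoring each language group separately, rather than pooling all samples, keeps a large high-resource group from masking weak performance on the long tail.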
Abstract
Multilingual falsehoods threaten information integrity worldwide, yet detection benchmarks remain confined to English or a few high-resource languages, leaving low-resource linguistic communities without robust defense tools. We introduce BLUFF, a comprehensive benchmark for detecting false and synthetic content, spanning 79 languages with over 202K samples, combining human-written fact-checked content (122K+ samples across 57 languages) and LLM-generated content (79K+ samples across 71 languages). BLUFF uniquely covers both high-resource "big-head" (20) and low-resource "long-tail" (59) languages, addressing critical gaps in multilingual research on detecting false and synthetic content. Our dataset features four content types (human-written, LLM-generated, LLM-translated, and hybrid human-LLM text), bidirectional translation (English↔X), 39 textual modification…
Taxonomy
Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Spam and Phishing Detection
