M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection

Yuxia Wang; Jonibek Mansurov; Petar Ivanov; Jinyan Su; Artem Shelmanov; Akim Tsvigun; Chenxi Whitehouse; Osama Mohammed Afzal; Tarek Mahmoud; Toru Sasaki; Thomas Arnold; Alham Fikri Aji; Nizar Habash; Iryna Gurevych; Preslav Nakov

arXiv:2305.14902 · cs.CL · March 12, 2024 · 21 cites

TL;DR

This paper introduces M4, a large-scale benchmark dataset for detecting machine-generated text across multiple generators, domains, and languages, and uses it to show that current detectors generalize poorly to unseen domains and LLMs.

Contribution

The creation of the large-scale M4 benchmark dataset for multi-generator, multi-domain, and multi-lingual machine-generated text detection.

Findings

01. Detectors struggle to generalize to unseen domains and LLMs.

02. On unseen domains or LLMs, detectors tend to misclassify machine-generated text as human-written.

03. The results leave significant room for improvement in detection methods.

Abstract

Large language models (LLMs) have demonstrated remarkable capability to generate fluent responses to a wide variety of user queries. However, this has also raised concerns about the potential misuse of such texts in journalism, education, and academia. In this study, we strive to create automated systems that can detect machine-generated texts and pinpoint potential misuse. We first introduce a large-scale benchmark M4, which is a multi-generator, multi-domain, and multi-lingual corpus for machine-generated text detection. Through an extensive empirical study of this dataset, we show that it is challenging for detectors to generalize well on instances from unseen domains or LLMs. In such cases, detectors tend to misclassify machine-generated text as human-written. These results show that the problem is far from solved and that there is a lot of room for improvement. We believe…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Natural Language Processing Techniques · Text Readability and Simplification