M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection
Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov

TL;DR
This paper introduces M4GT-Bench, a comprehensive multilingual, multi-domain benchmark for evaluating machine-generated text detection methods across various tasks, highlighting the challenges and current performance levels.
Contribution
The paper presents M4GT-Bench, the first extensive benchmark for evaluating multilingual, multi-domain, and multi-generator machine-generated text detection methods.
Findings
Detection performance improves with access to domain-specific training data.
Human detection accuracy is lower than that of automated methods.
Multi-task evaluation reveals varying challenges across detection types.
Abstract
The advent of Large Language Models (LLMs) has brought an unprecedented surge in machine-generated text (MGT) across diverse channels. This raises legitimate concerns about its potential misuse and societal implications. The need to identify and differentiate such content from genuine human-generated text is critical in combating disinformation, preserving the integrity of education and scientific fields, and maintaining trust in communication. In this work, we address this problem by introducing a new benchmark based on a multilingual, multi-domain, and multi-generator corpus of MGTs: M4GT-Bench. The benchmark comprises three tasks: (1) monolingual and multilingual binary MGT detection; (2) multi-way detection, where one needs to identify which particular model generated the text; and (3) mixed human-machine text detection, where a word boundary delimiting MGT from…
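As a rough illustration of the third task, the sketch below treats a mixed text as a sequence of whitespace-separated words with a single change point. The boundary convention (index of the first word after the change), the helper names `split_at_boundary` and `boundary_error`, and the absolute-difference score are all assumptions made for this example; the abstract excerpt above does not specify them.

```python
from typing import Tuple


def split_at_boundary(text: str, boundary: int) -> Tuple[str, str]:
    """Split a text into two segments at a word index.

    Assumption: `boundary` is the index of the first word of the second
    segment, counting whitespace-separated words from 0.
    """
    words = text.split()
    if not 0 <= boundary <= len(words):
        raise ValueError("boundary must lie between 0 and the word count")
    return " ".join(words[:boundary]), " ".join(words[boundary:])


def boundary_error(predicted: int, gold: int) -> int:
    """Absolute distance, in words, between predicted and gold boundaries.

    Hypothetical score for illustration only.
    """
    return abs(predicted - gold)


if __name__ == "__main__":
    first, second = split_at_boundary("one two three four", 2)
    print(first)   # words before the change point
    print(second)  # words from the change point onward
    print(boundary_error(5, 2))
```

Under this framing, the first two tasks are ordinary classification (a binary label, or one label per generator), while the third predicts an integer position, which is why it calls for a distance-style score and not accuracy.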
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Text and Document Classification Technologies
