MGTBench: Benchmarking Machine-Generated Text Detection
Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang

TL;DR
This paper introduces MGTBench, a comprehensive benchmark framework for evaluating machine-generated text detection methods against advanced LLMs, highlighting their strengths, limitations, and robustness challenges.
Contribution
The paper presents the first standardized benchmark for MGT detection, providing extensive evaluation, ablation studies, and analysis of robustness against adversarial attacks.
Findings
Detection generally improves as the number of words in a text increases, and most methods reach similar performance with far fewer training samples.
Larger models like ChatGPT-turbo improve detection accuracy.
Adversarial attacks significantly reduce detection effectiveness.
Abstract
Nowadays, powerful large language models (LLMs) such as ChatGPT have demonstrated revolutionary power in a variety of tasks. Consequently, the detection of machine-generated texts (MGTs) is becoming increasingly crucial as LLMs become more advanced and prevalent. These models have the ability to generate human-like language, making it challenging to discern whether a text is authored by a human or a machine. This raises concerns regarding authenticity, accountability, and potential bias. However, existing methods for detecting MGTs are evaluated using different model architectures, datasets, and experimental settings, resulting in a lack of a comprehensive evaluation framework that encompasses various methodologies. Furthermore, it remains unclear how existing detection methods would perform against powerful LLMs. In this paper, we fill this gap by proposing the first benchmark…
Taxonomy
Topics: Topic Modeling
