MGTBench: Benchmarking Machine-Generated Text Detection

Xinlei He; Xinyue Shen; Zeyuan Chen; Michael Backes; Yang Zhang

arXiv:2303.14822 · cs.CR · January 17, 2024 · 31 citations

Open Access · 4 Repos · 1 Model · 1 Dataset

TL;DR

This paper introduces MGTBench, a comprehensive benchmark framework for evaluating machine-generated text detection methods against advanced LLMs, highlighting their strengths, limitations, and robustness challenges.

Contribution

The paper presents the first standardized benchmark for MGT detection, providing extensive evaluation, ablation studies, and analysis of robustness against adversarial attacks.

Findings

01

Detection performance generally improves with longer texts, and most methods reach similar performance with far fewer training samples.

02

Larger models like ChatGPT-turbo improve detection accuracy.

03

Adversarial attacks significantly reduce detection effectiveness.

Abstract

Nowadays, powerful large language models (LLMs) such as ChatGPT have demonstrated revolutionary power in a variety of tasks. Consequently, the detection of machine-generated texts (MGTs) is becoming increasingly crucial as LLMs become more advanced and prevalent. These models have the ability to generate human-like language, making it challenging to discern whether a text is authored by a human or a machine. This raises concerns regarding authenticity, accountability, and potential bias. However, existing methods for detecting MGTs are evaluated using different model architectures, datasets, and experimental settings, resulting in a lack of a comprehensive evaluation framework that encompasses various methodologies. Furthermore, it remains unclear how existing detection methods would perform against powerful LLMs. In this paper, we fill this gap by proposing the first benchmark…

Code & Models

Models

huyen89/MGTDetectionModel
model· 6 dl· ♡ 2

Datasets

artnitolog/llm-generated-texts
dataset· 544 dl

Taxonomy

Topics: Topic Modeling