Applying Ensemble Methods to Model-Agnostic Machine-Generated Text Detection

Ivan Ong; Boon King Quek

arXiv:2406.12570·cs.CL·June 19, 2024

Open Access

TL;DR

This paper explores ensemble methods for detecting machine-generated text when the underlying language model is unknown, reaching an AUROC of 0.73 with simple summary statistics and 0.94 with supervised learning.

Contribution

It introduces ensemble techniques applied to DetectGPT outputs, enhancing model-agnostic detection accuracy without requiring prior knowledge of the generative model.

Findings

01

Summary statistics improve AUROC from 0.61 to 0.73

02

Supervised learning boosts AUROC to 0.94

03

Summary-statistic ensembling retains DetectGPT's zero-shot nature; the supervised variant requires a training dataset

Abstract

In this paper, we study the problem of detecting machine-generated text when the large language model (LLM) it is possibly derived from is unknown. We do so by applying ensembling methods to the outputs from DetectGPT classifiers (Mitchell et al. 2023), a zero-shot model for machine-generated text detection which is highly accurate when the generative (or base) language model is the same as the discriminative (or scoring) language model. We find that simple summary statistics of DetectGPT sub-model outputs yield an AUROC of 0.73 (relative to 0.61) while retaining its zero-shot nature, and that supervised learning methods sharply boost the accuracy to an AUROC of 0.94 but require a training dataset. This suggests the possibility of further generalisation to create a highly accurate, model-agnostic machine-generated text detector.
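The zero-shot ensembling idea in the abstract can be illustrated with a small sketch. This is not the authors' code: the function names (`auroc`, `ensemble`, `demo`), the choice of mean and max as summary statistics, and the synthetic score model (Gaussian noise plus a fixed shift for the one scoring model that matches the hidden base model) are all assumptions made for illustration. Each row stands in for the scores that several DetectGPT sub-models, one per scoring LLM, might assign to a single text.

```python
import random
import statistics


def auroc(scores, labels):
    """AUROC as the probability that a random positive (label 1) outscores
    a random negative (label 0); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble(sub_scores, stat=statistics.mean):
    """Collapse each row of per-sub-model scores into one score using a
    summary statistic. No training is involved, so this stays zero-shot."""
    return [stat(row) for row in sub_scores]


def demo(n=200, k=4, seed=0):
    """Synthetic illustration only (assumed data model, not real DetectGPT
    output): every text gets k noisy sub-model scores; for machine-generated
    texts, one randomly chosen sub-model (the one matching the unknown base
    model) receives a positive shift."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        label = i % 2  # alternate human (0) and machine (1) texts
        row = [rng.gauss(0.0, 1.0) for _ in range(k)]
        if label == 1:
            row[rng.randrange(k)] += 3.0  # hypothetical effect size
        rows.append(row)
        labels.append(label)
    single = auroc([row[0] for row in rows], labels)  # one fixed sub-model
    by_stat = {
        name: auroc(ensemble(rows, stat), labels)
        for name, stat in [("mean", statistics.mean), ("max", max)]
    }
    return single, by_stat


if __name__ == "__main__":
    single, by_stat = demo()
    print("single sub-model AUROC:", round(single, 3))
    for name, value in by_stat.items():
        print(f"{name} ensemble AUROC:", round(value, 3))
```

The supervised variant described in the abstract would instead fit a classifier on rows like these with known labels, which is why it needs a training dataset while the summary-statistic version does not.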

Taxonomy

Topics: Topic Modeling · Text and Document Classification Technologies · Natural Language Processing Techniques