Smaller Language Models are Better Black-box Machine-Generated Text Detectors

Niloofar Mireshghallah; Justus Mattern; Sicun Gao; Reza Shokri; Taylor Berg-Kirkpatrick

arXiv:2305.09859·cs.CL·February 27, 2024·5 cites

TL;DR

This paper shows that smaller, partially-trained language models are more effective at detecting machine-generated text across various models, regardless of training data similarity.

Contribution

It studies black-box detection in which a smaller model, rather than the generator itself, scores the text, and it shows that such smaller models are better at identifying text generated by larger models.

Findings

01 Smaller models outperform larger ones as detectors of machine-generated text.

02 Detection success does not depend heavily on whether the detector and the generator share training data.

03 Smaller detector models achieve higher AUC scores across different generator models.
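For readers unfamiliar with the metric in the findings above, AUC can be read as the probability that a randomly chosen machine-generated sample receives a higher detector score than a randomly chosen human-written one. The sketch below computes it by direct pairwise comparison. The function name `auc` and the score lists are invented for illustration and are not results from the paper.

```python
from itertools import product


def auc(machine_scores, human_scores):
    """Fraction of (machine, human) pairs in which the machine-generated
    sample gets the higher detector score; ties count as half a win."""
    wins = 0.0
    for m, h in product(machine_scores, human_scores):
        if m > h:
            wins += 1.0
        elif m == h:
            wins += 0.5
    return wins / (len(machine_scores) * len(human_scores))


# Hypothetical detector scores, made up purely for illustration.
machine = [0.9, 0.8, 0.4]
human = [0.5, 0.3, 0.1]
print(auc(machine, human))
```

An AUC of 1.0 means the detector ranks every machine-generated sample above every human-written one, while 0.5 corresponds to chance.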

Abstract

With the advent of fluent generative language models that can produce convincing utterances very similar to those written by humans, distinguishing whether a piece of text is machine-generated or human-written becomes more challenging and more important, as such models could be used to spread misinformation, fake news, fake reviews and to mimic certain authors and figures. To this end, there have been a slew of methods proposed to detect machine-generated text. Most of these methods need access to the logits of the target model or need the ability to sample from the target. One such black-box detection method relies on the observation that generated text is locally optimal under the likelihood function of the generator, while human-written text is not. We find that overall, smaller and partially-trained models are better universal text detectors: they can more precisely detect text…
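The local-optimality observation in the abstract can be illustrated with a toy score: compare the log-likelihood of a text with the average log-likelihood of slightly perturbed copies. A clearly positive gap suggests the text sits near a local maximum of the scoring model. This is a minimal sketch under loud assumptions, not the paper's implementation: a smoothed unigram model stands in for the detector language model, random word substitution stands in for the perturbation procedure, and all names (`make_unigram_loglik`, `perturb`, `local_optimality_score`) are hypothetical.

```python
import math
import random


def make_unigram_loglik(corpus_tokens):
    """Return a log-likelihood function for an add-one-smoothed unigram model.
    Assumption: this toy model stands in for a real detector language model."""
    counts = {}
    for tok in corpus_tokens:
        counts[tok] = counts.get(tok, 0) + 1
    total = len(corpus_tokens)
    vocab_size = len(counts) + 1  # one extra slot for unseen tokens

    def loglik(tokens):
        return sum(
            math.log((counts.get(tok, 0) + 1) / (total + vocab_size))
            for tok in tokens
        )

    return loglik


def perturb(tokens, vocab, rng):
    """Replace one randomly chosen position with a random vocabulary word."""
    out = list(tokens)
    out[rng.randrange(len(out))] = rng.choice(vocab)
    return out


def local_optimality_score(tokens, loglik, vocab, n_perturbations=50, seed=0):
    """Log-likelihood of the text minus the mean over perturbed neighbours."""
    rng = random.Random(seed)
    neighbours = [perturb(tokens, vocab, rng) for _ in range(n_perturbations)]
    mean_neighbour = sum(loglik(n) for n in neighbours) / n_perturbations
    return loglik(tokens) - mean_neighbour


corpus = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(corpus)) + ["zebra"]  # "zebra" never occurs in the corpus
loglik = make_unigram_loglik(corpus)

# A text made of the model's most likely token versus its least likely one.
peak_score = local_optimality_score(["the"] * 3, loglik, vocab)
valley_score = local_optimality_score(["zebra"] * 3, loglik, vocab)
print(peak_score, valley_score)
```

In this toy setting the text built from the most likely token scores higher than the text built from the unseen token, since perturbing the former can only lower its likelihood. A detector along these lines would flag a text as machine-generated when the score exceeds a threshold; swapping in a different scoring model changes only `loglik`.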

Code & Models

Repositories

loris3/evaluation_explanation_quality
pytorch

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Layer · Adam · Linear Warmup With Cosine Annealing · Softmax · Layer Normalization · Byte Pair Encoding