Automatic Detection of Machine Generated Text: A Critical Survey

Ganesh Jawahar; Muhammad Abdul-Mageed; Laks V.S. Lakshmanan

arXiv:2011.01314 · cs.CL · November 4, 2020

TL;DR

This paper critically surveys the rapidly growing field of detecting machine-generated text, analyzing current methods, challenges, and future research directions to combat misuse of text generative models.

Contribution

The paper provides the first comprehensive review and error analysis of existing detectors for machine-generated text, highlighting research gaps and guiding future efforts.

Findings

01. State-of-the-art detectors have significant error rates.

02. Existing methods struggle with generalization across models.

03. Future research should focus on robustness and interpretability.

Abstract

Text generative models (TGMs) excel in producing text that matches the style of human language reasonably well. Such TGMs can be misused by adversaries, e.g., by automatically generating fake news and fake product reviews that can look authentic and fool humans. Detectors that can distinguish text generated by a TGM from human-written text play a vital role in mitigating such misuse of TGMs. Recently, there has been a flurry of work from both the natural language processing (NLP) and machine learning (ML) communities to build accurate detectors for English. Despite the importance of this problem, there is currently no work that surveys this fast-growing literature and introduces newcomers to important research challenges. In this work, we fill this void by providing a critical survey and review of this literature to facilitate a comprehensive understanding of this problem. We conduct an…

Code & Models

Repositories

UBC-NLP/coling2020_machine_generated_text (Official)
