Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense
Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, Mohit Iyyer

TL;DR
Paraphrasing AI-generated text can effectively evade detection methods, but a retrieval-based approach can robustly identify such paraphrases, enhancing detection resilience.
Contribution
We develop DIPPER, a paraphrase model that evades detectors, and propose a retrieval-based defense that significantly improves detection robustness against paraphrasing attacks.
Findings
DIPPER reduces the detection accuracy of DetectGPT from 70.3% to 4.6% at a constant false positive rate, and it also evades watermarking, GPTZero, and OpenAI's text classifier.
The retrieval-based defense detects 80-97% of paraphrased texts with minimal false positives.
Our models, code, and data are open-sourced.
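The retrieval idea behind the defense can be illustrated with a small sketch. It assumes the API provider keeps a store of everything the model has generated and flags a candidate text when it closely matches a stored generation. The class `GenerationStore`, its `threshold` value, and the bag-of-words cosine similarity are all hypothetical stand-ins chosen so the example runs with the standard library alone; they are not the retriever or settings used by the authors.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a toy stand-in for a semantic text encoder."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class GenerationStore:
    """Hypothetical database of texts previously produced by a model API."""

    def __init__(self, threshold=0.6):
        # Assumed cutoff; a real system would tune it on human-written text.
        self.threshold = threshold
        self._entries = []

    def add(self, generated_text):
        """Record one generation at the time the API returns it."""
        self._entries.append((generated_text, bag_of_words(generated_text)))

    def best_match(self, candidate):
        """Return (similarity, stored_text) for the closest stored generation."""
        query = bag_of_words(candidate)
        best_score, best_text = 0.0, None
        for text, vector in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_text = score, text
        return best_score, best_text

    def is_flagged(self, candidate):
        """Flag the candidate if it is close enough to any stored generation."""
        score, _ = self.best_match(candidate)
        return score >= self.threshold
```

A word-overlap measure like this one would miss paraphrases that change most of the vocabulary, which is why a practical version would swap `bag_of_words` and `cosine` for a semantic encoder or a proper retrieval index.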
Abstract
The rise in malicious usage of large language models, such as fake content creation and academic plagiarism, has motivated the development of approaches that identify AI-generated text, including those based on watermarking or outlier detection. However, the robustness of these detection algorithms to paraphrases of AI-generated text remains unclear. To stress test these detectors, we build an 11B-parameter paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering. Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking, GPTZero, DetectGPT, and OpenAI's text classifier. For example, DIPPER drops detection accuracy of DetectGPT from 70.3% to 4.6% (at a constant false positive rate of…
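Detection accuracy "at a constant false positive rate" can be read as follows: pick the score threshold that misflags at most a fixed fraction of human-written texts, then measure what fraction of AI-generated texts still score above it. The sketch below shows that calculation under stated assumptions. The function names are hypothetical, higher scores are assumed to mean "more likely AI-generated", and the target rate is a parameter because the value used in the paper is not given here.

```python
def threshold_at_fpr(human_scores, target_fpr):
    """Pick a threshold so that at most target_fpr of human texts score above it.

    A text is flagged when its score is strictly greater than the threshold.
    """
    ranked = sorted(human_scores, reverse=True)
    # Number of human-written texts we are allowed to misflag (rounded down).
    allowed = int(target_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every text may be flagged
    return ranked[allowed]


def detection_rate(ai_scores, threshold):
    """Fraction of AI-generated texts flagged at the given threshold."""
    return sum(score > threshold for score in ai_scores) / len(ai_scores)


def false_positive_rate(human_scores, threshold):
    """Fraction of human-written texts wrongly flagged at the given threshold."""
    return sum(score > threshold for score in human_scores) / len(human_scores)
```

Holding the threshold fixed while swapping in scores for paraphrased text is what makes a before-and-after comparison meaningful: any drop in `detection_rate` is then due to the paraphrasing, not to a looser or stricter cutoff.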
Code & Models
- 🤗 kalpeshk2011/dipper-paraphraser-xxl · model · 4.9k dl · ♡ 52
- 🤗 kalpeshk2011/dipper-paraphraser-xxl-no-context · model · 15 dl
- 🤗 SamSJackson/paraphrase-dipper-no-ctx · model · 138 dl · ♡ 2
- 🤗 cPower/dipper-paraphraser-xxl-tokeninc · model · 3 dl
- 🤗 gerasimovmaxim02/paraphrase-dipper-no-ctx-rebuilt · model
- 🤗 gerasimovmaxim02/dipper-rebuilt · model · 2 dl
- 🤗 fahmid/dipper-paraphraser-xxl · model
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Authorship Attribution and Profiling
