The Impact of Prompts on Zero-Shot Detection of AI-Generated Text
Kaito Taguchi, Yujie Gu, and Kouichi Sakurai

TL;DR
This paper investigates how prompts influence the accuracy of zero-shot detectors in identifying AI-generated text, revealing that prompt-aware detection significantly improves performance.
Contribution
The paper introduces an evaluation framework and an empirical analysis showing how prompts affect the accuracy of zero-shot detectors, highlighting the importance of prompt information at detection time.
Findings
Prompt-aware detection improves AUC by at least 0.1 across tested detectors.
Zero-shot detectors are significantly affected by the presence or absence of prompts.
Giving the detector access to the original prompt improves detection accuracy in practical applications.
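To make the size of a 0.1 AUC gap concrete, the following is a minimal sketch of how AUC can be computed from detector scores. The function name `auc` and all score values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def auc(human_scores, ai_scores):
    """Fraction of (AI, human) pairs in which the AI-written text gets the
    higher detector score; ties count as half. This equals the area under
    the ROC curve."""
    wins = 0.0
    for a in ai_scores:
        for h in human_scores:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(ai_scores) * len(human_scores))


# Hypothetical detector scores (higher means "more likely AI-generated").
human = [0.1, 0.3, 0.4, 0.6]
ai_with_prompt = [0.5, 0.7, 0.8, 0.9]      # detector is given the prompt
ai_without_prompt = [0.2, 0.5, 0.7, 0.9]   # detector sees the text alone

auc_with = auc(human, ai_with_prompt)
auc_without = auc(human, ai_without_prompt)
print(auc_with, auc_without, auc_with - auc_without)
```

In this made-up example the gap between the two settings exceeds 0.1, which is the order of magnitude the findings describe.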
Abstract
In recent years, there have been significant advancements in the development of Large Language Models (LLMs). While their practical applications are now widespread, their potential for misuse, such as generating fake news and committing plagiarism, has posed significant concerns. To address this issue, detectors have been developed to evaluate whether a given text is human-generated or AI-generated. Among others, zero-shot detectors stand out as effective approaches that do not require additional training data and are often likelihood-based. In chat-based applications, users commonly input prompts and utilize the AI-generated texts. However, zero-shot detectors typically analyze these texts in isolation, neglecting the impact of the original prompts. It is conceivable that this approach may lead to a discrepancy in likelihood assessments between the text generation phase and the…
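The likelihood discrepancy described in the abstract can be illustrated with a toy model. The sketch below assumes a hand-written bigram table (`BIGRAM`) and a hypothetical helper `avg_log_likelihood`; neither comes from the paper, and real zero-shot detectors score text with an actual language model. It only shows that the same text receives a different likelihood depending on whether the prompt is part of the context.

```python
import math

# Hypothetical toy bigram model: P(next token | previous token).
# "<s>" marks the start of the input.
BIGRAM = {
    "<s>": {"write": 0.5, "the": 0.3, "cats": 0.2},
    "write": {"about": 0.9, "the": 0.1},
    "about": {"cats": 0.8, "the": 0.2},
    "cats": {"purr": 0.7, "sleep": 0.3},
    "purr": {"softly": 1.0},
}


def avg_log_likelihood(text_tokens, prompt_tokens=()):
    """Average per-token log-likelihood of the text, optionally conditioned
    on a prompt. Only the text tokens are scored; the prompt supplies context."""
    prev = ["<s>", *prompt_tokens][-1]
    total = 0.0
    for tok in text_tokens:
        # Unseen transitions get a small floor probability (an assumption).
        p = BIGRAM.get(prev, {}).get(tok, 1e-6)
        total += math.log(p)
        prev = tok
    return total / len(text_tokens)


prompt = ["write", "about"]
text = ["cats", "purr", "softly"]

with_prompt = avg_log_likelihood(text, prompt)   # context used at generation time
without_prompt = avg_log_likelihood(text)        # text analyzed in isolation
print(with_prompt, without_prompt)
```

Because a bigram model looks back only one token, the prompt changes just the first token's probability here; with a model that attends to the whole context, every token's likelihood can shift.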
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning
