Detecting AI-Generated Texts in Cross-Domains

You Zhou; Jie Wang

arXiv:2410.13966·cs.CL·October 21, 2024

TL;DR

This paper introduces a domain-adaptive detection method for AI-generated texts using a fine-tuned RoBERTa-Ranker, outperforming existing tools across multiple domains with minimal labeled data.

Contribution

It proposes a novel fine-tuning approach for RoBERTa-Ranker that enhances cross-domain detection of AI-generated texts with limited labeled data.

Findings

1. Outperforms DetectGPT and GPTZero on both in-domain and cross-domain texts
2. Requires only a small amount of labeled data in a new domain for fine-tuning
3. Enables a single system to detect AI-generated texts across various domains

Abstract

Existing tools to detect text generated by a large language model (LLM) have had some success, but their performance can drop when dealing with texts in new domains. To tackle this issue, we train a ranking classifier called RoBERTa-Ranker, a modified version of RoBERTa, as a baseline model using a dataset we constructed that includes a wide variety of texts written by humans and generated by various LLMs. We then present a method to fine-tune RoBERTa-Ranker that requires only a small amount of labeled data in a new domain. Experiments show that this fine-tuned domain-aware model outperforms the popular DetectGPT and GPTZero on both in-domain and cross-domain texts, where AI-generated texts may either be in a different domain or generated by a different LLM not used to generate the training datasets. This approach makes it feasible and economical to build a single system to…

Code & Models

Repositories

zyloveslego/LLMCheck (official)

Taxonomy

Methods: Attention Is All You Need · Dropout · Dense Connections · Layer Normalization · Residual Connection · Linear Warmup With Linear Decay · Weight Decay · Adam · Attention Dropout