Advancing LLM detection in the ALTA 2024 Shared Task: Techniques and Analysis

Dima Galat

arXiv:2412.19076·cs.CL·December 30, 2024

TL;DR

This paper investigates sentence-level detection techniques for AI-generated text, revealing that ChatGPT-3.5 Turbo exhibits unique probability patterns that facilitate reliable in-domain detection despite minor textual changes.

Contribution

The paper analyzes sentence-level evaluation methods for hybrid articles and shows that probability pattern analysis is effective for detecting content generated by ChatGPT-3.5 Turbo.

Findings

01 ChatGPT-3.5 Turbo shows distinct, repetitive probability patterns that enable in-domain detection.

02 Minor textual modifications, such as rewording, have minimal impact on detection accuracy.

03 These insights support the development of robust AI detection methods.

Abstract

The recent proliferation of AI-generated content has prompted significant interest in developing reliable detection methods. This study explores techniques for identifying AI-generated text through sentence-level evaluation within hybrid articles. Our findings indicate that ChatGPT-3.5 Turbo exhibits distinct, repetitive probability patterns that enable consistent in-domain detection. Empirical tests show that minor textual modifications, such as rewording, have minimal impact on detection accuracy. These results provide valuable insights for advancing AI detection methodologies, offering a pathway toward robust solutions to address the complexities of synthetic text identification.
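
To make the idea of sentence-level detection from probability patterns concrete, here is a minimal sketch in Python. It is not the paper's method: this page does not state the features, scoring model, or classifier the author used. The sketch assumes that each sentence of a hybrid article has already been scored by some language model to obtain per-token log-probabilities. The function names, thresholds, and toy log-probability values are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def sentence_features(token_logprobs):
    """Summarize one sentence's per-token log-probabilities.

    The choice of features (mean, spread, minimum) is an assumption,
    not taken from the paper.
    """
    return {
        "mean_logprob": mean(token_logprobs),
        "std_logprob": pstdev(token_logprobs),
        "min_logprob": min(token_logprobs),
    }


def label_sentences(article, mean_threshold=-2.0, std_threshold=1.5):
    """Label each sentence with a hypothetical threshold rule.

    A sentence is labeled 'machine' when its tokens are, on average,
    highly probable and the probabilities vary little. Both thresholds
    are illustrative placeholders; a real system would fit a classifier
    on labeled in-domain data.
    """
    labels = []
    for token_logprobs in article:
        feats = sentence_features(token_logprobs)
        is_machine = (
            feats["mean_logprob"] > mean_threshold
            and feats["std_logprob"] < std_threshold
        )
        labels.append("machine" if is_machine else "human")
    return labels


# Toy hybrid article: one list of token log-probabilities per sentence.
# These numbers are made up for illustration, not measured values.
hybrid_article = [
    [-0.2, -0.5, -0.1, -0.4],  # uniformly high-probability tokens
    [-3.1, -0.3, -6.2, -1.0],  # lower and more variable probabilities
]

print(label_sentences(hybrid_article))  # ['machine', 'human']
```

Labeling each sentence independently matches the hybrid-article setting described in the abstract, where human-written and generated sentences can appear in the same document. In practice the fixed thresholds would likely be replaced by a classifier trained on in-domain examples.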

Taxonomy

Topics: Data Quality and Management · Service-Oriented Architecture and Web Services · Topic Modeling