Advancing LLM detection in the ALTA 2024 Shared Task: Techniques and Analysis
Dima Galat

TL;DR
This paper investigates sentence-level detection techniques for AI-generated text, revealing that ChatGPT-3.5 Turbo exhibits unique probability patterns that facilitate reliable in-domain detection despite minor textual changes.
Contribution
It analyzes sentence-level evaluation methods and shows that probability-pattern analysis is effective for detecting ChatGPT-3.5 Turbo-generated content.
Findings
ChatGPT-3.5 Turbo shows distinct probability patterns for detection.
Minor rewording has little impact on detection accuracy.
The insights support the development of robust AI detection methods.
Abstract
The recent proliferation of AI-generated content has prompted significant interest in developing reliable detection methods. This study explores techniques for identifying AI-generated text through sentence-level evaluation within hybrid articles. Our findings indicate that ChatGPT-3.5 Turbo exhibits distinct, repetitive probability patterns that enable consistent in-domain detection. Empirical tests show that minor textual modifications, such as rewording, have minimal impact on detection accuracy. These results provide valuable insights for advancing AI detection methodologies, offering a pathway toward robust solutions to address the complexities of synthetic text identification.
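The abstract does not say which probability features or classifier the study uses, so the following is only a minimal sketch of what sentence-level labelling from probability patterns could look like. It assumes per-token log-probabilities have already been obtained from some scoring language model. The function names, the two features (mean and standard deviation), the thresholds, and the heuristic that machine-written sentences are more uniformly high-probability are all illustrative assumptions, not the paper's method.

```python
from statistics import mean, pstdev


def sentence_features(token_logprobs):
    """Summarise one sentence's per-token log-probabilities (hypothetical features)."""
    return {
        "mean_logprob": mean(token_logprobs),
        "std_logprob": pstdev(token_logprobs),
    }


def label_sentences(sentences_logprobs, mean_threshold=-2.0, std_threshold=1.5):
    """Label each sentence of a hybrid article as 'machine' or 'human'.

    Assumed heuristic: a sentence whose tokens are highly probable on average
    and vary little is labelled 'machine'. The thresholds are made up for
    illustration and are not taken from the paper.
    """
    labels = []
    for token_logprobs in sentences_logprobs:
        feats = sentence_features(token_logprobs)
        is_machine = (
            feats["mean_logprob"] > mean_threshold
            and feats["std_logprob"] < std_threshold
        )
        labels.append("machine" if is_machine else "human")
    return labels


# Toy input: one list of token log-probabilities per sentence (invented values).
article = [
    [-0.2, -0.5, -0.1, -0.4],  # uniformly high-probability tokens
    [-3.5, -0.3, -6.1, -1.2],  # spiky, lower-probability tokens
]
print(label_sentences(article))  # ['machine', 'human']
```

In practice the fixed thresholds would be replaced by a classifier trained on in-domain labelled sentences, which is consistent with the abstract's emphasis on in-domain detection.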
Taxonomy
Topics: Data Quality and Management · Service-Oriented Architecture and Web Services · Topic Modeling
