AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models

Chinnappa Guggilla; Budhaditya Roy; Trupti Ramdas Chavan; Abdul Rahman; Edward Bowen

arXiv:2507.05157·cs.CL·July 8, 2025

TL;DR

This paper develops detection methods for AI-generated text using instruction fine-tuned large language models and transformers, addressing challenges in distinguishing machine from human writing and identifying the source model.

Contribution

The paper introduces fine-tuning approaches for LLMs and transformer-based models to improve both the detection of AI-generated text and the identification of its source model.
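As a rough illustration of what instruction fine-tuning data for such a detector might look like, the sketch below wraps a labeled text in a prompt/response record. The prompt wording, the label strings, and the `build_example` helper are all assumptions made for illustration; this summary does not state the template the authors used.

```python
from dataclasses import dataclass

# Hypothetical label vocabulary for the binary task (not taken from the paper).
LABELS = ("human", "machine")

# Hypothetical instruction template; the authors' actual wording is not given here.
PROMPT_TEMPLATE = (
    "Classify the following text as written by a human or generated by a machine.\n"
    "Text: {text}\n"
    "Answer:"
)


@dataclass(frozen=True)
class InstructionExample:
    prompt: str    # instruction plus the text to classify
    response: str  # target label the model is trained to emit


def build_example(text: str, label: str) -> InstructionExample:
    """Wrap one labeled text in an instruction-style prompt/response pair."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return InstructionExample(
        prompt=PROMPT_TEMPLATE.format(text=text.strip()),
        response=label,
    )


example = build_example("  The weather was lovely today.  ", "human")
print(example.prompt)
print(example.response)
```

Records of this shape could be fed to any supervised fine-tuning pipeline; for source identification, the label set would be extended to one entry per candidate generator.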

Findings

01

Achieved 95.47% accuracy in distinguishing human vs. machine text.

02

Identified the source LLM with 46.98% accuracy.

03

Demonstrated effectiveness of fine-tuned models in AI text detection.
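The two accuracy figures above come from different tasks: a two-way human vs. machine decision and a multi-way choice among candidate source models. A minimal sketch of how such an accuracy is computed, using made-up labels (the `model_a`/`model_b`/`model_c` names are placeholders, not the paper's label set):

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy binary task: human vs. machine.
binary_gold = ["human", "machine", "machine", "human"]
binary_pred = ["human", "machine", "human", "human"]
binary_acc = accuracy(binary_gold, binary_pred)

# Toy source-identification task with three placeholder generators.
source_gold = ["model_a", "model_b", "model_c", "model_a"]
source_pred = ["model_a", "model_c", "model_c", "model_b"]
source_acc = accuracy(source_gold, source_pred)

print(f"binary accuracy: {binary_acc:.2f}")
print(f"source accuracy: {source_acc:.2f}")
```

Because chance level on a balanced k-class problem is 1/k, a multi-class accuracy is not directly comparable to a binary one; the number of candidate source models matters when reading the second figure.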

Abstract

Large Language Models (LLMs) possess an extraordinary capability to produce text that is not only coherent and contextually relevant but also strikingly similar to human writing. They adapt to various styles and genres, producing content that is both grammatically correct and semantically meaningful. Recently, LLMs have been misused to create highly realistic phishing emails, spread fake news, generate code to automate cyber crime, and write fraudulent scientific articles. Additionally, in many real-world applications, the generated content, including its style and topic, and the generator model are not known beforehand. The increasing prevalence and sophistication of artificial intelligence (AI)-generated texts have made their detection progressively more challenging. Various attempts have been made to distinguish machine-generated text from human-authored content using linguistic,…

Taxonomy

Methods: Linear Warmup With Linear Decay · Attention Dropout · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Softmax · Transformer · Layer Normalization