Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing

Shoumik Saha; Soheil Feizi

arXiv:2502.15666·cs.CL·May 6, 2025

TL;DR

This paper examines how well detectors handle human-written text that has been polished with AI. Current detectors often flag even minimally polished text as AI-generated and cannot reliably gauge how much AI was involved, which points to a need for better detection methods.

Contribution

The study introduces the APT-Eval dataset and systematically evaluates twelve state-of-the-art AI-text detectors on AI-polished content, exposing their limitations.

Findings

01 Detectors often falsely flag minimally polished human text as AI-generated

02 Current detectors cannot reliably differentiate degrees of AI involvement

03 Detection accuracy is biased against older and smaller AI models

Abstract

The growing use of large language models (LLMs) for text generation has led to widespread concerns about AI-generated content detection. However, an overlooked challenge is AI-polished text, where human-written content undergoes subtle refinements using AI tools. This raises a critical question: should minimally polished text be classified as AI-generated? Such classification can lead to false plagiarism accusations and misleading claims about AI prevalence in online content. In this study, we systematically evaluate twelve state-of-the-art AI-text detectors using our AI-Polished-Text Evaluation (APT-Eval) dataset, which contains 14.7K samples refined at varying AI-involvement levels. Our findings reveal that detectors frequently flag even minimally polished text as AI-generated, struggle to differentiate between degrees of AI involvement, and exhibit biases against older and smaller…
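The kind of per-level comparison the abstract describes can be sketched as below. This is only a minimal illustration, not the authors' evaluation code: the function name `flag_rate_by_level`, the level labels ("human", "minimal", "major"), the 0.5 threshold, and every score in `toy_records` are assumptions made up for the example and are not taken from APT-Eval or from any detector.

```python
from collections import defaultdict


def flag_rate_by_level(records, threshold=0.5):
    """Return {level: fraction of samples whose score is >= threshold}.

    `records` is an iterable of (level, score) pairs, where `score` is a
    detector's estimated probability that the text is AI-generated.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for level, score in records:
        total[level] += 1
        if score >= threshold:
            flagged[level] += 1
    return {level: flagged[level] / total[level] for level in total}


# Hypothetical detector scores, invented purely for illustration.
toy_records = [
    ("human", 0.10), ("human", 0.20), ("human", 0.70), ("human", 0.30),
    ("minimal", 0.60), ("minimal", 0.40), ("minimal", 0.80), ("minimal", 0.55),
    ("major", 0.90), ("major", 0.65), ("major", 0.45), ("major", 0.95),
]

rates = flag_rate_by_level(toy_records)
for level, rate in rates.items():
    print(f"{level}: {rate:.2f} flagged as AI-generated")
```

In this toy data the "minimal" and "major" levels are flagged at the same rate, which is the shape of failure the abstract calls struggling to differentiate degrees of AI involvement. The rate on the "human" level is the false positive rate on unpolished text.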

Code & Models

Repositories

ShoumikSaha/ai-polished-text
PyTorch · Official

Datasets

smksaha/apt-eval
dataset · 75 dl

Videos

Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing · Underline