Beating the Style Detector: Three Hours of Agentic Research on the AI-Text Arms Race
Andreas Maier, Moritz Zaiss, Siming Bayer

TL;DR
This study reproduces and extends an NLP experiment on style transfer, showing that modern LLMs can substantially reduce how detectable AI-generated text is and highlighting an ongoing AI-text detection arms race.
Contribution
It provides a reproducible framework for evaluating style transfer and detection, showing LLMs' ability to evade detection with moderate effort and releasing all related code and data.
Findings
GPT-5.5 and Claude Opus 4.7 close 71-75% of the style gap to the same-author ceiling.
Detection models achieve high AUC scores, but are confounded by length and style.
Adversarial feedback reduces detection margins and enables evasion.
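One way to read the "share of the style gap closed" figure is as a normalized position between a baseline and a ceiling. The sketch below assumes that reading; the function name, the choice of the raw LLM draft as baseline, and the example similarity values are all illustrative assumptions, not the paper's definition or data.

```python
def gap_closed(baseline: float, candidate: float, ceiling: float) -> float:
    """Fraction of the baseline-to-ceiling gap recovered by `candidate`.

    Assumed reading: `baseline` is the style similarity of the unedited draft,
    `ceiling` is the similarity between two texts by the same author, and
    `candidate` is the similarity of the rewritten text.
    """
    gap = ceiling - baseline
    if gap <= 0:
        raise ValueError("ceiling must be strictly greater than baseline")
    return (candidate - baseline) / gap


# Made-up similarity scores, for illustration only (not from the paper).
raw_draft_sim = 0.40
same_author_sim = 0.80
rewritten_sim = 0.70
print(f"gap closed: {gap_closed(raw_draft_sim, rewritten_sim, same_author_sim):.0%}")
```

A value of 0 would mean the rewrite is no closer to the author's style than the raw draft, and 1 would mean it is as close as the author's own other texts.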
Abstract
Reproducing an empirical NLP study used to take weeks. Given the released data and a modern agentic-research harness, we redo every experiment of a recent ACL 2026 study on personal-style post-editing of LLM drafts, and add three new ones, with the human investigator acting only as a reviewer-in-the-loop. We reproduce all seven preregistered hypotheses and recover the paper's headline correlation between perceived self-similarity and embedding-measured self-similarity to three decimal places. Under a leakage-free held-out protocol, GPT-5.5 and Claude Opus 4.7 close 71-75% of the style gap to the same-author ceiling on paired tasks, against [...] for the human post-edit, and beat the human post-edit on [...] of tasks. We then frame the same data as an AI-text detection arms race. A leave-authors-out linear SVM on…
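The abstract names a leave-authors-out protocol for the detector. A minimal sketch of such a split, assuming it means that no author contributes texts to both the training and the test side of a fold, could look like this. The helper name `leave_authors_out_folds` and the toy author labels are hypothetical; the paper's actual fold construction and classifier settings are not given in the text above.

```python
from collections import defaultdict
from typing import Iterator, List, Sequence, Tuple


def leave_authors_out_folds(
    authors: Sequence[str], n_folds: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) with disjoint author sets per fold."""
    by_author = defaultdict(list)
    for idx, author in enumerate(authors):
        by_author[author].append(idx)

    unique_authors = sorted(by_author)
    if not 2 <= n_folds <= len(unique_authors):
        raise ValueError("n_folds must be between 2 and the number of authors")

    for k in range(n_folds):
        # Every n_folds-th author (starting at k) is held out in fold k.
        held_out = set(unique_authors[k::n_folds])
        test = [i for i, a in enumerate(authors) if a in held_out]
        train = [i for i, a in enumerate(authors) if a not in held_out]
        yield train, test


# Toy example: one author label per text sample (hypothetical data).
sample_authors = ["a", "a", "b", "c", "c", "d"]
folds = list(leave_authors_out_folds(sample_authors, n_folds=2))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Splitting by author rather than by text keeps a detector from scoring well simply by memorizing individual writers, which is the kind of leakage a held-out protocol is meant to rule out.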
