GPT Editors, Not Authors: The Stylistic Footprint of LLMs in Academic Preprints

Soren DeHaan; Yuanze Liu; Johan Bollen; Saúl A. Blanco

arXiv:2505.17327·cs.CL·May 26, 2025

TL;DR

This study investigates how large language models are used in academic preprints on arXiv. It finds that authors who use LLMs apply them uniformly across a paper, which points to editing rather than content generation and may reduce the risk of hallucinations in scholarly writing.

Contribution

The paper measures stylistic segmentation in arXiv papers by varying a PELT threshold against a Bayesian classifier trained on GPT-regenerated text, and reports that LLMs are mainly used for editing rather than for generating original content.

Findings

01

LLM influence is not predictive of stylistic segmentation.

02

Authors use LLMs uniformly, mainly for editing tasks.

03

Uniform, editing-style use reduces the risk of hallucinations being introduced into academic preprints.

Abstract

The proliferation of Large Language Models (LLMs) in late 2022 has impacted academic writing, threatening credibility, and causing institutional uncertainty. We seek to determine the degree to which LLMs are used to generate critical text as opposed to being used for editing, such as checking for grammar errors or inappropriate phrasing. In our study, we analyze arXiv papers for stylistic segmentation, which we measure by varying a PELT threshold against a Bayesian classifier trained on GPT-regenerated text. We find that LLM-attributed language is not predictive of stylistic segmentation, suggesting that when authors use LLMs, they do so uniformly, reducing the risk of hallucinations being introduced into academic preprints.
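
To make the idea of "varying a PELT threshold" concrete, here is a minimal sketch of PELT (Pruned Exact Linear Time) change-point detection in Python. It is not the authors' pipeline: the input `scores` is a hypothetical list of per-sentence values (for example, a classifier's estimate of how LLM-like each sentence is), the squared-error cost is an assumption made for illustration, and the function name `pelt_breakpoints` is invented here. A higher `penalty` makes the algorithm more reluctant to split the text into stylistically distinct segments.

```python
from typing import List, Sequence


def pelt_breakpoints(scores: Sequence[float], penalty: float) -> List[int]:
    """Return indices where a new segment starts (hypothetical helper).

    Uses PELT with a squared-error (change-in-mean) cost, which is an
    assumption for this sketch, not necessarily the paper's cost.
    """
    n = len(scores)
    # Prefix sums let us evaluate any segment cost in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(scores):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(a: int, b: int) -> float:
        # Sum of squared deviations from the mean over scores[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)  # best[t]: optimal penalized cost of scores[:t]
    best[0] = -penalty      # so that k segments pay (k - 1) penalties
    last = [0] * (n + 1)    # last[t]: start of the final segment in that optimum
    candidates = [0]        # segment starts that can still be optimal
    for t in range(1, n + 1):
        values = [best[s] + cost(s, t) + penalty for s in candidates]
        k = min(range(len(values)), key=values.__getitem__)
        best[t] = values[k]
        last[t] = candidates[k]
        # PELT pruning: drop starts that can never be optimal again.
        candidates = [s for s, v in zip(candidates, values)
                      if v - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts to recover the cuts.
    cuts = []
    t = n
    while t > 0:
        t = last[t]
        if t > 0:
            cuts.append(t)
    return sorted(cuts)


# Toy signal: six "human-like" sentences followed by six "LLM-like" ones.
toy_scores = [0.0] * 6 + [1.0] * 6
print(pelt_breakpoints(toy_scores, penalty=1.0))  # [6]
print(pelt_breakpoints(toy_scores, penalty=5.0))  # []
```

In this toy case a low penalty recovers the single style shift at index 6, while a high penalty treats the whole text as one segment. Sweeping the penalty and checking whether the resulting segments line up with LLM-attributed language is one way to read the abstract's description, under the assumptions stated above.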
