Estimating the prevalence of LLM-assisted text in scholarly writing

Andrew Gray

arXiv:2512.01560·cs.DL·December 2, 2025

Open Access

TL;DR

This paper introduces a simple method to estimate the increasing use of large language models in scholarly writing, revealing that over 10% of papers in 2024 likely involved LLMs, raising concerns about research integrity.

Contribution

It provides a reproducible methodology to detect LLM involvement in research papers and highlights the urgent need for disclosure policies to maintain research integrity.

Findings

01. Over 10% of 2024 papers likely involved LLMs

02. Use of indicative words correlates with LLM involvement

03. Current disclosure practices are insufficient

Abstract

The use of large language models (LLMs) in scholarly publications has grown dramatically since the launch of ChatGPT in late 2022. This usage is often undisclosed, and it can be challenging for readers and reviewers to identify human-written but LLM-revised or translated text, or predominantly LLM-generated text. Given the known quality and reliability issues connected with LLM-generated text, the potential growth of this usage poses an increasing problem for research integrity and for public trust in research. This study presents a simple and easily reproducible methodology to show this growth in the full text of published papers, across the full range of research, as indexed in the Dimensions database. It uses this to demonstrate that LLM tools are likely to have been involved in the production of more than 10% of all published papers in 2024, based on disproportionate use of specific…
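The abstract is truncated here, so the paper's exact procedure is not visible on this page. As a rough illustration of the general idea of measuring "disproportionate use" of indicative words, the sketch below compares the share of documents containing a marker word in a recent period against a pre-LLM baseline. Everything in it is an assumption made for illustration: the marker list, the function names, and the toy texts are hypothetical and are not taken from the paper or from Dimensions.

```python
import re
from typing import Iterable

# Hypothetical marker words, chosen only for illustration; NOT the paper's list.
MARKERS = ("delve", "meticulous")


def contains_marker(text: str, markers: Iterable[str] = MARKERS) -> bool:
    """Return True if any marker appears as a whole word (case-insensitive)."""
    pattern = r"\b(?:" + "|".join(re.escape(m) for m in markers) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def marker_share(texts: list[str]) -> float:
    """Fraction of documents containing at least one marker word."""
    if not texts:
        return 0.0
    return sum(contains_marker(t) for t in texts) / len(texts)


def excess_share(baseline_texts: list[str], current_texts: list[str]) -> float:
    """Share in the current period above the baseline share, floored at zero."""
    return max(0.0, marker_share(current_texts) - marker_share(baseline_texts))


# Toy corpora (made up): a "pre-LLM" baseline and a "recent" sample.
baseline = [
    "We measure the effect.",
    "Results are shown.",
    "We delve into the data.",
    "A simple model.",
]
current = [
    "We delve into the results.",
    "A meticulous analysis follows.",
    "Results are shown.",
    "We delve deeper.",
]

print(excess_share(baseline, current))
```

The whole-word match is deliberately strict, so inflected forms such as "delves" are not counted. Subtracting the baseline share treats pre-LLM usage of the same words as the expected background rate, which is a simplifying assumption rather than a claim about how the study handles it.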

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Academic integrity and plagiarism · Academic Publishing and Open Access