Does GenAI Rewrite How We Write? An Empirical Study on Two-Million Preprints

Minfeng Qi; Zhongmin Cao; Qin Wang; Ningran Li; Tianqing Zhu

arXiv:2510.17882 · cs.CY · October 22, 2025

Open Access

TL;DR

This large-scale study examines how generative AI is changing scientific preprints. It reports faster submission and revision cycles, a modest increase in linguistic complexity, and a disproportionate expansion of AI-related topics.

Contribution

The paper provides the first comprehensive empirical analysis of LLMs' impact on preprint publishing across multiple repositories and disciplines.

Findings

01 LLMs accelerate submission and revision cycles

02 Linguistic complexity in preprints has modestly increased

03 AI-related topics have disproportionately expanded in research

Abstract

Preprint repositories have become central infrastructure for scholarly communication. Their expansion has transformed how research is circulated and evaluated before journal publication. Generative large language models (LLMs) introduce a further potential disruption by altering how manuscripts are written. While speculation abounds, systematic evidence of whether and how LLMs reshape scientific publishing remains limited. This paper addresses the gap through a large-scale analysis of more than 2.1 million preprints spanning 2016–2025 (115 months) across four major repositories (arXiv, bioRxiv, medRxiv, and SocArXiv). We introduce a multi-level analytical framework that integrates interrupted time-series models, collaboration and productivity metrics, linguistic profiling, and topic modeling to assess changes in volume, authorship, style, and disciplinary orientation. Our findings reveal…
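
The abstract names interrupted time-series models without giving their form. As an illustration only, one common formulation is segmented regression, which estimates a level change and a slope change at an intervention month. The sketch below assumes that formulation and runs on synthetic, noise-free monthly counts; the function name `fit_its`, the breakpoint month, and all numbers except the 115-month span are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_its(y, breakpoint):
    """Fit a segmented regression (one common interrupted time-series form):
        y = b0 + b1*t + b2*post + b3*(t - breakpoint)*post
    where post is 1 from the breakpoint month onward and 0 before it.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= breakpoint).astype(float)
    # Design matrix: intercept, pre-existing trend, level shift, trend shift.
    X = np.column_stack([np.ones_like(t), t, post, (t - breakpoint) * post])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["baseline", "pre_slope", "level_change", "slope_change"]
    return dict(zip(names, coef))

# Synthetic example: 115 monthly counts with a made-up intervention month.
n_months = 115          # matches the study span stated in the abstract
breakpoint = 83         # hypothetical intervention index, not from the paper
t = np.arange(n_months)
post = t >= breakpoint
# Made-up ground truth: trend of 5 per month, jump of 200, extra slope of 3.
y = 1000 + 5 * t + 200 * post + 3 * (t - breakpoint) * post

result = fit_its(y, breakpoint)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Here `level_change` captures an immediate shift at the breakpoint and `slope_change` captures a change in the monthly trend afterwards. Real submission counts would also need handling for seasonality and autocorrelation, which this sketch omits.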

Taxonomy

Topics: Academic Publishing and Open Access · scientometrics and bibliometrics research · Biomedical Text Mining and Ontologies