Does GenAI Rewrite How We Write? An Empirical Study on Two-Million Preprints
Minfeng Qi, Zhongmin Cao, Qin Wang, Ningran Li, Tianqing Zhu

TL;DR
This large-scale study examines how generative AI influences scientific preprints. It reports faster submission and revision cycles, a modest rise in linguistic complexity, and a disproportionate expansion of AI-related topics.
Contribution
The paper provides the first comprehensive empirical analysis of LLMs' impact on preprint publishing across multiple repositories and disciplines.
Findings
LLMs accelerate submission and revision cycles
Linguistic complexity in preprints has modestly increased
AI-related topics have disproportionately expanded in research
Abstract
Preprint repositories have become central infrastructures for scholarly communication. Their expansion transforms how research is circulated and evaluated before journal publication. Generative large language models (LLMs) introduce a further potential disruption by altering how manuscripts are written. While speculation abounds, systematic evidence of whether and how LLMs reshape scientific publishing remains limited. This paper addresses the gap through a large-scale analysis of more than 2.1 million preprints spanning 2016–2025 (115 months) across four major repositories (i.e., arXiv, bioRxiv, medRxiv, SocArXiv). We introduce a multi-level analytical framework that integrates interrupted time-series models, collaboration and productivity metrics, linguistic profiling, and topic modeling to assess changes in volume, authorship, style, and disciplinary orientation. Our findings reveal…
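To make the interrupted time-series component of the abstract concrete, here is a minimal sketch of a segmented regression on a monthly count series. It is not the authors' model: the function name `fit_its`, the break index of 80, and the synthetic series are all assumptions chosen for illustration. Only the series length of 115 months comes from the abstract.

```python
import numpy as np


def fit_its(y, breakpoint):
    """Fit a segmented (interrupted time-series) regression by least squares.

    Assumed model (illustrative, not taken from the paper):
        y_t = b0 + b1*t + b2*post_t + b3*(t - breakpoint)*post_t
    where post_t is 1 for months at or after the break and 0 before it.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= breakpoint).astype(float)
    # Design matrix: intercept, pre-break trend, level shift, slope shift.
    X = np.column_stack([np.ones_like(t), t, post, (t - breakpoint) * post])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["intercept", "pre_slope", "level_change", "slope_change"]
    return dict(zip(names, coef))


# Synthetic, noise-free monthly series with a hypothetical break at month 80.
months = np.arange(115, dtype=float)
after = (months >= 80).astype(float)
series = 100 + 2.0 * months + 15.0 * after + 1.5 * (months - 80) * after

result = fit_its(series, breakpoint=80)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this formulation, `level_change` captures an immediate jump at the break and `slope_change` captures a change in the monthly trend afterwards. A real analysis would also need to handle noise, seasonality, and autocorrelation, which this sketch leaves out.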
Taxonomy
Topics: Academic Publishing and Open Access · Scientometrics and Bibliometrics Research · Biomedical Text Mining and Ontologies
