Examining Linguistic Shifts in Academic Writing Before and After the Launch of ChatGPT: A Study on Preprint Papers
Tong Bao, Yi Zhao, Jin Mao, Chengzhi Zhang

TL;DR
This study analyzes linguistic changes in academic abstracts before and after ChatGPT's launch. It finds increases in LLM-preferred words, lexical complexity, and sentiment, and decreases in syntactic complexity, cohesion, and readability, especially among scholars with lower English proficiency.
Contribution
It provides a large-scale, systematic analysis of linguistic shifts in academic writing attributable to LLMs, highlighting discipline-specific differences and impacts on writing complexity.
Findings
Increase in LLM-preferred words in abstracts
Rise in lexical complexity and sentiment
Decrease in syntactic complexity, cohesion, and readability
Abstract
Large Language Models (LLMs), such as ChatGPT, have raised concerns among academics about their impact on academic writing. Existing studies have primarily examined LLM usage in academic writing through quantitative approaches, such as word frequency statistics and probability-based analyses. However, few have systematically examined the potential impact of LLMs on the linguistic characteristics of academic writing. To address this gap, we conducted a large-scale analysis of 823,798 abstracts published over the last decade, drawn from the arXiv dataset. Our linguistic analysis of features such as the frequency of LLM-preferred words, lexical complexity, syntactic complexity, cohesion, readability, and sentiment indicates a significant increase in the proportion of LLM-preferred words in abstracts, revealing the widespread influence of LLMs on academic writing. Additionally, we observed…
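As a rough illustration of the first feature named in the abstract, the proportion of LLM-preferred words in a text could be computed along the following lines. This is a minimal sketch, not the authors' pipeline: the word list `LLM_PREFERRED`, the regex tokenizer, and the function names are all assumptions made for the example, and the paper's actual vocabulary and preprocessing are not given in this summary.

```python
import re

# Hypothetical word list for illustration only; NOT the list used in the paper.
LLM_PREFERRED = {"delve", "intricate", "showcase", "pivotal"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def preferred_word_ratio(text, vocabulary=LLM_PREFERRED):
    """Return the fraction of tokens in `text` that appear in `vocabulary`."""
    tokens = tokenize(text)
    if not tokens:
        # Avoid division by zero for empty or non-alphabetic input.
        return 0.0
    hits = sum(1 for tok in tokens if tok in vocabulary)
    return hits / len(tokens)
```

Calling `preferred_word_ratio` on each abstract and averaging by publication month would give one simple before-and-after comparison. The sketch matches exact word forms only, so inflections such as "delves" are missed unless the text is lemmatized first.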
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Text Readability and Simplification · Academic integrity and plagiarism
