Model Misalignment and Language Change: Traces of AI-Associated Language in Unscripted Spoken English
Bryce Anderson, Riley Galpin, Tom S. Juzek

TL;DR
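This study investigates whether the recent rise in AI-associated words extends to unscripted spoken English, which would point to change in human language use rather than direct use of AI as a writing tool. It analyzes 22.1 million words of conversational podcast speech from before and after ChatGPT's release in 2022.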
Contribution
It provides empirical evidence of increased usage of LLM-associated words in spoken language post-2022, highlighting potential early signs of language shift due to AI influence.
Findings
Moderate yet significant rise in LLM-associated words after 2022
No significant change in baseline synonyms
Potential early indicator of language evolution
Abstract
In recent years, written language, particularly in science and education, has undergone remarkable shifts in word usage. These changes are widely attributed to the growing influence of Large Language Models (LLMs), which frequently rely on a distinct lexical style. Divergences between model output and target audience norms can be viewed as a form of misalignment. While these shifts are often linked to using Artificial Intelligence (AI) directly as a tool to generate text, it remains unclear whether the changes reflect broader changes in the human language system itself. To explore this question, we constructed a dataset of 22.1 million words from unscripted spoken language drawn from conversational science and technology podcasts. We analyzed lexical trends before and after ChatGPT's release in 2022, focusing on commonly LLM-associated words. Our results show a moderate yet significant…
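The before/after comparison described in the abstract can be illustrated with a minimal sketch. It assumes a word's raw counts in a pre-2022 and a post-2022 subcorpus are already available, normalizes them to occurrences per million words, and applies the two-corpus log-likelihood (G2) statistic commonly used in corpus linguistics. The function names and all counts are invented for illustration, and the paper may use a different test.

```python
import math


def per_million(count, corpus_size):
    """Normalized frequency: occurrences per million words."""
    return count / corpus_size * 1_000_000


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-corpus log-likelihood (G2) for one word.

    Compares the observed counts with the counts expected if the word
    were equally frequent in both corpora.
    """
    total = count_a + count_b
    combined_size = size_a + size_b
    expected_a = size_a * total / combined_size
    expected_b = size_b * total / combined_size
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Hypothetical counts for a single word; NOT data from the paper.
pre_count, pre_size = 120, 10_000_000
post_count, post_size = 200, 10_000_000

pre_rate = per_million(pre_count, pre_size)
post_rate = per_million(post_count, post_size)
g2 = log_likelihood(pre_count, pre_size, post_count, post_size)

# With one degree of freedom, G2 above 3.84 corresponds to p < 0.05.
print(f"pre: {pre_rate:.1f} pmw, post: {post_rate:.1f} pmw, G2 = {g2:.2f}")
```

Running the same comparison on baseline synonyms of each target word would give the control contrast mentioned in the findings.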
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Computational and Text Analysis Methods · Text Readability and Simplification
