Do LLMs write like humans? Variation in grammatical and rhetorical styles

Alex Reinhart; Ben Markey; Michael Laudenbach; Kachatad Pantusen; Ronald Yurko; Gordon Weinberg; David West Brown

arXiv:2410.16107 · cs.CL · August 25, 2025

TL;DR

This study investigates whether large language models (LLMs) write in the rhetorical and grammatical styles of humans. It finds systematic differences that persist across models and are more pronounced in instruction-tuned models than in base models, which points to the difficulty of replicating human stylistic variation.

Contribution

The paper analyzes the rhetorical and grammatical styles of LLMs and, using Douglas Biber's set of lexical, grammatical, and rhetorical features, demonstrates measurable differences from human writing.

Findings

01

Differences in stylistic features between LLMs and humans are systematic and measurable.

02

The differences persist when moving from smaller models to larger ones, and are more pronounced in instruction-tuned models than in base models.

03

Advanced linguistic features can effectively distinguish LLM output from human writing.

Abstract

Large language models (LLMs) are capable of writing grammatical text that follows instructions, answers questions, and solves problems. As they have advanced, it has become difficult to distinguish their output from human-written text. While past research has found some differences in surface features such as word choice and punctuation, and developed classifiers to detect LLM output, none has studied the rhetorical styles of LLMs. Using several variants of Llama 3 and GPT-4o, we construct two parallel corpora of human- and LLM-written texts from common prompts. Using Douglas Biber's set of lexical, grammatical, and rhetorical features, we identify systematic differences between LLMs and humans and between different LLMs. These differences persist when moving from smaller models to larger ones, and are larger for instruction-tuned models than base models. This observation of…
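The comparison described in the abstract (measure a linguistic feature in parallel human and LLM corpora, then compare) can be illustrated with a minimal sketch. This is not the authors' pipeline: the first-person pronoun list is a hypothetical stand-in for a single Biber-style feature, the regex tokenizer is a crude substitute for a real tagger, and the four sample sentences are invented toy data.

```python
import re

# Hypothetical stand-in for one Biber-style feature: first-person pronouns.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}


def tokenize(text):
    """Return lowercase word tokens (a crude substitute for a real tagger)."""
    return re.findall(r"[a-z']+", text.lower())


def rate_per_1000(text, vocabulary):
    """Count tokens found in `vocabulary`, normalized per 1,000 tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in vocabulary)
    return 1000.0 * hits / len(tokens)


def mean_rate(corpus, vocabulary):
    """Average the per-text feature rate over a list of texts."""
    return sum(rate_per_1000(t, vocabulary) for t in corpus) / len(corpus)


# Invented toy corpora, for illustration only.
human_texts = ["I told my sister we would be late.", "We lost our keys again."]
llm_texts = ["The committee reviewed the proposal.",
             "It is important to note the results."]

# Positive value: the feature is more frequent in the human corpus.
difference = mean_rate(human_texts, FIRST_PERSON) - mean_rate(llm_texts, FIRST_PERSON)
print(f"difference in rate per 1,000 tokens: {difference:.1f}")
```

Normalizing by text length makes texts of different sizes comparable. A realistic study would repeat this over many features and test whether the differences are systematic.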

Taxonomy

Topics: Second Language Acquisition and Learning · Linguistics and Terminology Studies · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training · Balanced Selection · LLaMA