Saying More Than They Know: A Framework for Quantifying Epistemic-Rhetorical Miscalibration in Large Language Models
Asim D. Bakhshi

TL;DR
This paper introduces a framework for quantifying how far the rhetorical intensity of large language model (LLM) output is out of proportion to its epistemic grounding, and reports systematic miscalibration across argumentative texts.
Contribution
It proposes a triadic epistemic-rhetorical marker (ERM) taxonomy and composite metrics for measuring epistemic-rhetorical decoupling, which enables automated detection of LLM-generated content.
Findings
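LLM-generated texts use tricolon (a series of three parallel elements) at nearly twice the rate of human experts.
Human authors use erotema (the rhetorical question) more than twice as often as LLMs.
LLM-generated texts show higher form-meaning divergence (FMD) and a more uniform distribution of rhetorical devices.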
Abstract
Large language models (LLMs) exhibit systematic miscalibration: their rhetorical intensity is not proportionate to their epistemic grounding. This study tests that hypothesis and proposes a framework for quantifying the decoupling, built on a triadic epistemic-rhetorical marker (ERM) taxonomy. The taxonomy is operationalized through three composite metrics: form-meaning divergence (FMD), genuine-to-performed epistemic ratio (GPR), and rhetorical device distribution entropy (RDDE). Applied to 225 argumentative texts spanning approximately 0.6 million tokens across human expert, human non-expert, and LLM-generated sub-corpora, the framework identifies a consistent, model-agnostic LLM epistemic signature. LLM-generated texts produce tricolon at nearly twice the expert rate, while human authors produce erotema at more than twice the LLM rate. Performed hesitancy markers appear at…
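The abstract names RDDE but this summary does not give its formula. As a rough illustration only, the sketch below assumes RDDE behaves like a normalized Shannon entropy over the counts of rhetorical devices tagged in a text, so that a more uniform spread of devices yields a higher score. The function name `device_entropy`, the normalization choice, and the device labels other than tricolon and erotema are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def device_entropy(device_labels, normalize=True):
    """Shannon entropy of a list of rhetorical-device labels.

    Assumption: this stands in for RDDE; the paper's actual
    definition is not available in this summary.
    """
    counts = Counter(device_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no devices tagged, so no distribution to measure

    # Entropy in bits over the observed device frequencies.
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)

    # Optionally rescale to [0, 1] by the maximum entropy for the
    # number of distinct devices observed (hypothetical choice).
    if normalize and len(counts) > 1:
        entropy /= math.log2(len(counts))
    return entropy


# Hypothetical label sequences, for illustration only.
uniform_text = ["tricolon", "erotema", "anaphora", "antithesis"]
skewed_text = ["erotema"] * 7 + ["tricolon"]

print(device_entropy(uniform_text))  # evenly spread devices
print(device_entropy(skewed_text))   # one device dominates
```

Under this assumed definition, the evenly spread sequence scores higher than the skewed one, which matches the direction of the reported finding that LLM-generated texts distribute rhetorical devices more uniformly.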
