The Polite Liar: Epistemic Pathology in Language Models
Bentley DeVilling (Course Correct Labs)

TL;DR
Large language models often speak confidently about things they do not know. The paper attributes this epistemic pathology to reinforcement learning from human feedback (RLHF), which rewards perceived sincerity over factual accuracy.
Contribution
This paper identifies and analyzes the structural cause of epistemic indifference in language models, and proposes an epistemic alignment principle that prioritizes justified confidence.
Findings
RLHF leads to models that maximize perceived sincerity rather than truth
Models perform conversational fluency as if it were a virtue, without epistemic grounding
The paper proposes an epistemic alignment principle for better model training
Abstract
Large language models exhibit a peculiar epistemic pathology: they speak as if they know, even when they do not. This paper argues that such confident fabrication, what I call the polite liar, is a structural consequence of reinforcement learning from human feedback (RLHF). Building on Frankfurt's analysis of bullshit as communicative indifference to truth, I show that this pathology is not deception but structural indifference: a reward architecture that optimizes for perceived sincerity over evidential accuracy. Current alignment methods reward models for being helpful, harmless, and polite, but not for being epistemically grounded. As a result, systems learn to maximize user satisfaction rather than truth, performing conversational fluency as a virtue. I analyze this behavior through the lenses of epistemic virtue theory, speech-act philosophy, and cognitive alignment, showing that…
Taxonomy
Topics: Ethics and Social Impacts of AI · Embodied and Extended Cognition · Language and cultural evolution
